Abstract

Based on a study of the coupling by reflection of diffusion processes, a new monotonicity
in time of a time-dependent transportation cost between heat distributions is
shown under Bakry-Emery's curvature-dimension condition on a Riemannian manifold.
The cost function comes from the total variation between heat distributions on
spaceforms. As a corollary, we obtain a comparison theorem for the total variation
between heat distributions. In addition, we show that our monotonicity is stable
under the Gromov-Hausdorff convergence of the underlying space under a uniform
curvature-dimension and diameter bound.
1 Introduction

Analysis of the heat equation on manifolds or metric measure spaces is one of the central
issues in the literature, and it is a place where the analysis of partial differential equations,
differential geometry and probability theory interact with each other. As
one remarkable consequence of such interaction, many different characterizations
of a lower Ricci curvature bound by means of the heat semigroup or the
Brownian motion are given in [36]. Among those studies, recent developments in the
theory of optimal transport enable us to interpret the heat distribution as a gradient curve
of the relative entropy in the space of probability measures (see [3,35], for instance), following
Otto's heuristic idea in [28]. This viewpoint provides a quite natural understanding of the
fact that a lower Ricci curvature bound implies a contraction property of
heat distributions in Wasserstein distance. Significantly, this argument yields some
of the implications between the equivalent notions in [36] mentioned above. As its probabilistic
counterpart, we can show the contraction by constructing a coupling by parallel
transport of Brownian motions. On the other hand, there is another kind of coupling,
called the coupling by reflection or the Kendall-Cranston coupling, which is also well
studied in connection with the Riemannian geometry of the underlying space. The purpose
of this article is to study the coupling by reflection by formulating it in terms of the theory
of optimal transport.

To state our result, we introduce the notion of transportation cost. Given a function
$c : M \times M \to \mathbb{R}$ on a state space $M$, a transportation cost $T_c(\mu,\nu)$ between two probability
measures $\mu$ and $\nu$ on $M$ is defined as follows:

\[ T_c(\mu,\nu) := \inf_{\pi \in \Pi(\mu,\nu)} \int_{M \times M} c \, d\pi, \]

where $\Pi(\mu,\nu)$ is the set of all couplings of $\mu$ and $\nu$, namely, probability measures on
$M \times M$ whose marginal distributions are $\mu$ and $\nu$ respectively. The simplest case of our
result is as follows:

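Theorem 1.1 Let $M$ be a complete Riemannian manifold with nonnegative Ricci curvature
with the Riemannian distance $d$. Let us define $\varphi_t(a)$ for $t, a \ge 0$ by

\[ \varphi_t(a) := \begin{cases}
\dfrac{1}{\sqrt{4\pi t}} \displaystyle\int_0^\infty \left( \exp\!\left(-\frac{1}{4t}\Bigl(x - \frac{a}{2}\Bigr)^2\right) - \exp\!\left(-\frac{1}{4t}\Bigl(x + \frac{a}{2}\Bigr)^2\right) \right) dx, & t > 0, \\[2ex]
1_{(0,\infty)}(a), & t = 0.
\end{cases} \]

Then, for $t > 0$ and two heat distributions $\mu_s^{(i)}$ ($i = 1,2$) generated by the Laplace-Beltrami
operator, we have

\[ T_{\varphi_{t-s_2}(d)}\bigl(\mu_{s_2}^{(1)}, \mu_{s_2}^{(2)}\bigr) \le T_{\varphi_{t-s_1}(d)}\bigl(\mu_{s_1}^{(1)}, \mu_{s_1}^{(2)}\bigr) \tag{1.1} \]

for any $0 \le s_1 < s_2 \le t$.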
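As a sanity check on the definition of $\varphi_t$, the integral can be evaluated in closed form: completing the Gaussian integrals gives $\varphi_t(a) = \operatorname{erf}(a/(4\sqrt{t}))$ for $t > 0$. The following minimal sketch compares this closed form with a midpoint-rule evaluation of the defining integral. The function names and the truncation point of the integral are illustrative choices, not part of the paper.

```python
import math


def phi_closed(t: float, a: float) -> float:
    """Closed form of phi_t(a): erf(a / (4 sqrt(t))) for t > 0, indicator at t = 0."""
    if t == 0.0:
        return 1.0 if a > 0 else 0.0
    return math.erf(a / (4.0 * math.sqrt(t)))


def phi_quadrature(t: float, a: float, n: int = 20000) -> float:
    """Midpoint rule for the defining integral of phi_t(a), t > 0.

    The integral over (0, infinity) is truncated where the Gaussian
    integrand is negligible (an assumption of this sketch).
    """
    upper = a / 2.0 + 12.0 * math.sqrt(2.0 * t)
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += math.exp(-((x - a / 2.0) ** 2) / (4.0 * t))
        total -= math.exp(-((x + a / 2.0) ** 2) / (4.0 * t))
    return total * h / math.sqrt(4.0 * math.pi * t)
```

In particular $\varphi_t$ is increasing and concave in $a$ and decreasing in $t$, which is the shape used in the discussion of concave cost functions later in the introduction.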

In the full statement in Theorem 2.3, the same result as (1.1) holds for distributions
of a diffusion process with an upper bound of dimension and a lower Ricci curvature
bound in the sense of Bakry and Emery with an appropriate choice of $\varphi_t(a)$, which is
more complicated. Alternatively, (1.1) could be formulated as a non-expansion result of
Lipschitz constants with respect to time-dependent metrics $\varphi_t(d)$; see Theorem 6.1. This
allows us to interpret $\varphi_t$ as the profile of the "worst case" initial data corresponding to
$\varphi_0(d)$. Given $K \in \mathbb{R}$, the $L^p$-contraction in Wasserstein distance mentioned above means

\[ e^{pKt}\, T_{d^p}\bigl(\mu_t^{(1)}, \mu_t^{(2)}\bigr) \le e^{pKs}\, T_{d^p}\bigl(\mu_s^{(1)}, \mu_s^{(2)}\bigr) \tag{1.2} \]

for $t > s > 0$ and any heat distributions $\mu_t^{(i)}$ ($i = 1,2$). It holds with $K = 0$ under
the assumption in Theorem 1.1 and hence Theorem 1.1 can be regarded as an analogue
of it. Indeed, the only difference between them is the choice of the cost function. The
counterpart of Theorem 6.1 for (1.2) is the equivalence with Bakry-Emery's $L^q$-gradient
estimates (see [21]).

Let us review the history of the study of coupling by reflection, to explain the meaning
and significance of Theorem 1.1. We call $(X_1(t), X_2(t))$ a coupling of a diffusion process
$X(t)$ on a state space $M$ if $(X_1, X_2)$ is a stochastic process on $M \times M$ and each $X_i$ behaves
as $X$ on $M$ for $i = 1,2$. The coupling by reflection on a Euclidean space, or the mirror
coupling, of Brownian motions introduced in [24] is given by the global reflection with
respect to the hyperplane bisecting the line segment joining initial positions. With the help
of Riemannian geometry, a coupling by reflection of Brownian motions on a Riemannian
manifold is constructed by Kendall [15] and Cranston [6], by making a coupling of their
infinitesimal motions. In many applications, it is convenient to suppose that they coalesce
after the coupling time, namely, the time when they meet. As a matter of fact, the
coupling by reflection of Brownian motions on a Euclidean space, or more generally the
one on a Riemannian manifold with nonnegative Ricci curvature, meets in a finite time
almost surely regardless of the dimension of the space. This is in great contrast with the case
of two independent Brownian motions. Under a suitable condition, for example,
the presence of curvature bounds on the state space, this kind of coupling has provided
several applications, e.g. in estimating the rate of convergence to equilibrium, functional
inequalities involving heat semigroups, and (non-)existence of harmonic maps (see [16] and
references therein, for instance). As a simple example, the coupling by reflection under
nonnegative Ricci curvature easily implies the Liouville property, that is, non-existence
of nonconstant bounded harmonic functions. In many of those applications, we only need
to know the existence of a coupling $\pi$ of two distributions of $X(t)$ having a good estimate
of $\pi(\{(x,x) \mid x \in M\})$, which can be provided by comparing the transportation cost in
Theorem 1.1 at $s = t$ with that at $s = 0$ since $\varphi_0 = 1_{(0,\infty)}$. Thus the monotonicity of
the transportation cost in Theorem 1.1 works sufficiently well in applications.

Recently, such topics as mentioned above have been extensively studied on metric
measure spaces more singular than Riemannian manifolds under a new, synthetic notion of curvature
bounds (see e.g. [5,25,31,32]). However, the traditional way of studying a coupling by
reflection of Brownian motions is based on the theory of stochastic differential equations
and hence there are many difficulties in extending the original argument directly to analysis
on such singular spaces. In contrast to such an approach, the statement of Theorem 1.1
makes complete sense even on singular spaces once we introduce the notion of heat
distributions on them. Thus there seems to be some possibility to extend it to such cases
though our framework in Theorem 1.1 or Theorem 2.3 is still on a Riemannian manifold.
Actually, we obtain a partial result in this direction by showing that the monotonicity
of the transportation cost stated in Theorem 1.1 (or Theorem 2.3) is stable under the
Gromov-Hausdorff convergence of underlying spaces (see Theorem 7.2) with a uniform
curvature-dimension and diameter bound.

One might wonder why the cost function in Theorem 1.1 appears. It is based on the
fact that $T_{\varphi_{t-s}(d)}(\mu_s^{(1)}, \mu_s^{(2)})$ is a constant function in $s$ and the infimum in the definition of
the transportation cost is attained by the coupling by reflection when $M$ is a Euclidean
space [12,19]. Our choice of the cost function is natural and sharp in this sense. Our
argument in the proof of Theorem 1.1 is based on a comparison between the distance
process for the coupling by reflection on $M$ and the one on a Euclidean space, and
the sharpness on a Euclidean space plays a prominent role when we deal with the
comparison process. It should be remarked that, when $M$ is a Euclidean space, the
cost function $\varphi_t(d(x,y))$ coincides with the total variation between two heat distributions
at time $t$ with initial distributions $\delta_x$ and $\delta_y$ respectively. This fact is closely related
to the maximality of the coupling by reflection of Brownian motions on a Euclidean
space (see [12,19], for instance). Indeed, by choosing the cost function as the total
variation between heat distributions, exactly the same constancy holds on a spaceform
since the coupling by reflection of Brownian motions is also maximal. As we will see,
this characterization of our cost function leads to the following comparison theorem for
the total variation between heat distributions. Let us denote the total variation norm by
$\| \cdot \|_{TV}$.

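Corollary 1.2 Let $M$ be a complete Riemannian manifold whose dimension is less than
or equal to $N \in \mathbb{N}$ and whose Ricci curvature is greater than or equal to $K \in \mathbb{R}$. Then,
for two heat distributions $\mu_t^{(1)}$, $\mu_t^{(2)}$ with $\mu_0^{(i)} = \delta_{x_i}$ for some $x_i \in M$ ($i = 1,2$),

\[ \bigl\| \mu_t^{(1)} - \mu_t^{(2)} \bigr\|_{TV} \le \frac{1}{2} \int_{\mathbb{M}_{K,N}} \bigl| p_t^{K,N}(\tilde{x}_1, y) - p_t^{K,N}(\tilde{x}_2, y) \bigr| \, \mathrm{vol}_{\mathbb{M}_{K,N}}(dy), \tag{1.3} \]

where $p_t^{K,N}(x,y)$ is the heat kernel on the $N$-dimensional spaceform $\mathbb{M}_{K,N}$ of constant
sectional curvature $K/(N-1)$ and $(\tilde{x}_1, \tilde{x}_2)$ is any pair of points in $\mathbb{M}_{K,N}$ satisfying
$d(\tilde{x}_1, \tilde{x}_2) = d(x_1, x_2)$.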

This is a special case of Corollary 2.4 below. It seems to be natural that we can measure 
the total variation as a result of a study of the coupling by reflection since the coupling 
by reflection has been strongly related with estimates involving the coupling time, which 
yields an estimate of the total variation between distributions via the coupling inequality 
(see [23], for instance). 

Note that, to the best of the authors' knowledge, an estimate of type (1.1) with the
use of a time-dependent cost function is studied first in [30, Example 4.6]. While it is
discussed only on $\mathbb{R}^m$, it includes Lévy processes as an example. Also note that our
cost function in Theorem 1.1 (or Theorem 2.3) is a concave function of the distance
function. It corresponds to the observation in [9], which says that the optimal transport
map for a concave cost function reverses the orientation. Indeed, the reflection map used
in constructing the coupling by reflection does so. Finally, we remark that there is a recent
related result in [7], which studies the behavior of the transportation cost of a concave cost
function in connection with the coupling by reflection of a diffusion process on $\mathbb{R}^m$.
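The orientation-reversal remark can be illustrated on a toy example on the real line. The sketch below is not taken from the paper: it places two unit atoms of mass $1/2$ at $0, 1$ and at $2, 3$, and finds the optimal matching by brute force over permutations. This is legitimate for uniform marginals on equally many atoms, where an optimal coupling can be chosen among permutations. The concave cost uses the closed form $\varphi_1(r) = \operatorname{erf}(r/4)$ of the cost in Theorem 1.1; the helper names are hypothetical.

```python
import itertools
import math


def phi(t: float, r: float) -> float:
    """Closed form of the cost phi_t(r) from Theorem 1.1 for t > 0."""
    return math.erf(r / (4.0 * math.sqrt(t)))


def best_matching(xs, ys, cost):
    """Return (permutation, transport cost) minimizing the average of cost(|x - y|).

    Brute force over all permutations; only meant for a handful of atoms.
    """
    def total(perm):
        return sum(cost(abs(x - ys[j])) for x, j in zip(xs, perm)) / len(xs)

    best = min(itertools.permutations(range(len(ys))), key=total)
    return best, total(best)


xs = [0.0, 1.0]
ys = [2.0, 3.0]

# Concave cost: the optimal plan sends 0 -> 3 and 1 -> 2 (orientation reversed).
concave_plan, concave_value = best_matching(xs, ys, lambda r: phi(1.0, r))

# Convex cost r^2: the optimal plan is the monotone one, 0 -> 2 and 1 -> 3.
convex_plan, convex_value = best_matching(xs, ys, lambda r: r * r)
```

The reversed plan moves one atom far and the other only a short way, which a concave cost rewards; this is the discrete analogue of the reflection map.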

The organization of the paper is as follows. In the next section, we give a more
precise statement of our main results. To prove them, we study the coupling by
reflection in section 3. There we follow the argument in [17,20], in which the
coupling by reflection is constructed via an approximation of diffusion processes by geodesic random
walks. It might be possible to follow the alternative approach in [38]. Section 4 is devoted
to showing several regularity properties of the function $\varphi_t^{K,N}(a)$ introduced in section 2 to
describe the main theorem. Some explicit expressions of $\varphi_t^{K,N}(a)$ as well as its asymptotic
behavior as $t \to 0$ or $t \to \infty$ are also given there. Some results in this section might be of
independent interest. The proof of our main theorem is given in section 5. Though most
of it follows from the result in section 3, we need an additional argument with the aid
of results in section 4 to complete the proof. We also study new monotonicity formulae
for time-independent transportation costs (Corollary 5.3) as a consequence of the main
theorem and results in section 4. In section 6, we give a short remark on gradient estimates
for the diffusion semigroup corresponding to our main theorem. Though a similar gradient
estimate is already obtained in [6] in the same spirit, what we obtain is sharper in
many respects. The stability of our main result under the Gromov-Hausdorff convergence
is discussed in section 7. It ensures that all the results obtained before this section are
inherited by the measured Gromov-Hausdorff limit under a uniform curvature-dimension
and diameter bound. In section 8, we give a brief comment on the extension of
results in sections 2-6 to the time-dependent metric case. Note that the assumption there
is satisfied if the metric evolves according to the backward Ricci flow.



2 Framework and the main result 



Let $(M, g)$ be a complete $m$-dimensional Riemannian manifold with $m \ge 2$. Let $d$ stand
for the Riemannian distance on $M$. Let $Z$ be a $C^1$-vector field and we denote the generator
of the form $\Delta + Z$ by $\mathcal{L}$, where $\Delta$ is the Laplace-Beltrami operator with respect to $g$.
Let $((X(t))_{t \in [0,\infty)}, (P_x)_{x \in M})$ be a diffusion process associated with $\mathcal{L}$. Let $(\nabla Z)^\flat$ be the
symmetrization of $\nabla Z$, i.e., a $(0,2)$-tensor given by

\[ (\nabla Z)^\flat(X, Y) := \frac{1}{2}\bigl( \langle \nabla_X Z, Y \rangle + \langle \nabla_Y Z, X \rangle \bigr). \]

Our basic assumption is the following condition involving the upper dimension bound and
lower Ricci curvature bound formulated in terms of $\mathcal{L}$:

Assumption 1 Given $K \in \mathbb{R}$ and $N \in [m, \infty]$, the following holds:

\[ \mathrm{Ric} - (\nabla Z)^\flat - \frac{1}{N - m}\, Z \otimes Z \ge K g. \]

Here we regard the third term on the left-hand side as $0$ when $N = \infty$, and $N = m$ is
permitted only when $Z = 0$.

It is well known that Assumption 1 is equivalent to the following curvature-dimension
condition of Bakry and Emery (see e.g. [4,22]):

\[ \frac{1}{2}\bigl( \mathcal{L}\langle \nabla f, \nabla f \rangle - 2\langle \nabla f, \nabla \mathcal{L} f \rangle \bigr) \ge K \langle \nabla f, \nabla f \rangle + \frac{1}{N} (\mathcal{L} f)^2. \]

This condition is equivalent to $\dim M \le N$ and $\mathrm{Ric} \ge K$ when $Z = 0$.

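In order to state our main theorems, we introduce the notion of comparison process
and associated transportation costs. Let $K \in \mathbb{R}$ and $N \in [2, \infty]$. Set $R = R_{K,N}$ by

\[ R_{K,N} := \begin{cases} \pi \sqrt{\dfrac{N-1}{K}} & \text{if } K > 0 \text{ and } N < \infty, \\[1.5ex] \infty & \text{otherwise.} \end{cases} \]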
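We define $s_K$ and $c_K$ as the usual comparison functions as follows:

\[ s_K(\theta) := \begin{cases} \dfrac{1}{\sqrt{K}} \sin(\sqrt{K}\,\theta) & K > 0, \\ \theta & K = 0, \\ \dfrac{1}{\sqrt{-K}} \sinh(\sqrt{-K}\,\theta) & K < 0, \end{cases}
\qquad
c_K(\theta) := \begin{cases} \cos(\sqrt{K}\,\theta) & K > 0, \\ 1 & K = 0, \\ \cosh(\sqrt{-K}\,\theta) & K < 0, \end{cases} \]

and $t_K := s_K / c_K$. Let $\Psi = \Psi_{K,N} : (-R, R) \to \mathbb{R}$ be given by

\[ \Psi(u) := \begin{cases} -2K\, t_{K/(N-1)}\!\left(\dfrac{u}{2}\right) & \text{if } N < \infty, \\[1.5ex] -Ku & \text{otherwise.} \end{cases} \]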
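The comparison functions and the drift $\Psi_{K,N}$ translate directly into code. The following is a minimal sketch under the reading $\Psi(u) = -2K\,t_{K/(N-1)}(u/2)$ for $N < \infty$ and $\Psi(u) = -Ku$ for $N = \infty$; the function names are illustrative, and `math.inf` is used here to encode $N = \infty$.

```python
import math


def s_K(K: float, theta: float) -> float:
    """Comparison function s_K: sin, identity or sinh depending on the sign of K."""
    if K > 0:
        return math.sin(math.sqrt(K) * theta) / math.sqrt(K)
    if K == 0:
        return theta
    return math.sinh(math.sqrt(-K) * theta) / math.sqrt(-K)


def c_K(K: float, theta: float) -> float:
    """Comparison function c_K: cos, constant 1 or cosh depending on the sign of K."""
    if K > 0:
        return math.cos(math.sqrt(K) * theta)
    if K == 0:
        return 1.0
    return math.cosh(math.sqrt(-K) * theta)


def psi(K: float, N: float, u: float) -> float:
    """Drift Psi_{K,N}(u); the caller must keep |u| < R_{K,N} when K > 0 and N < inf."""
    if math.isinf(N):
        return -K * u
    kappa = K / (N - 1.0)
    return -2.0 * K * s_K(kappa, u / 2.0) / c_K(kappa, u / 2.0)
```

For large $N$ the finite-dimensional drift approaches $-Ku$, which is consistent with the two branches of the definition.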



Let us define a diffusion process $\rho = \rho_{K,N,a}(t)$, $t \ge 0$, on $(-R, R)$ as a solution to
the following stochastic differential equation:

\[ d\rho_{K,N,a}(t) = 2\sqrt{2}\, d\beta(t) + \Psi(\rho_{K,N,a}(t))\, dt, \qquad \rho_{K,N,a}(0) = a. \tag{2.1} \]

Note that, when $R < \infty$, both $-R$ and $R$ are entrance boundaries for $\rho(t)$. For $t > 0$, let
us define $\varphi_t^{K,N} : [0, R) \to [0, 1]$ by

\[ \varphi_t^{K,N}(a) := P\Bigl[ \inf_{0 \le s \le t} \rho_{K,N,a}(s) > 0 \Bigr]. \]
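The definition of $\varphi_t^{K,N}$ as a survival probability suggests a direct Monte Carlo check. The sketch below treats only the case $N = \infty$, where the drift is read as $\Psi(u) = -Ku$, and uses a plain Euler-Maruyama scheme with discrete monitoring of the barrier at $0$. The step count, sample size and seed are arbitrary choices of this sketch, and discrete monitoring slightly overestimates the survival probability.

```python
import math
import random


def survival_probability(K, a, t, n_steps=200, n_paths=4000, seed=0):
    """Monte Carlo estimate of P[inf_{s <= t} rho(s) > 0] for
    d rho = 2 sqrt(2) d beta - K rho dt, rho(0) = a  (the case N = infinity)."""
    rng = random.Random(seed)
    dt = t / n_steps
    noise = 2.0 * math.sqrt(2.0 * dt)  # 2 sqrt(2) times sqrt(dt)
    alive = 0
    for _ in range(n_paths):
        rho = a
        for _ in range(n_steps):
            rho += noise * rng.gauss(0.0, 1.0) - K * rho * dt
            if rho <= 0.0:
                break  # the path has hit zero before time t
        else:
            alive += 1  # the loop finished without hitting zero
    return alive / n_paths


estimate = survival_probability(K=0.0, a=1.0, t=0.25)
exact = math.erf(1.0 / (4.0 * math.sqrt(0.25)))  # closed form for K = 0
```

For $K = 0$ the estimate can be compared with $\operatorname{erf}(a/(4\sqrt{t}))$, which follows from the reflection principle; agreement is only up to sampling and discretization error.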



Remark 2.1 (i) The process $\rho_{K,N,a}$ comes from the coupling by reflection on the spaceform.
Actually, when $N \in \mathbb{N}$, a simple computation implies that the distance process
$d(\mathbf{X}(t))$ for the coupling by reflection $\mathbf{X}(t) = (X_1(t), X_2(t))$ of Brownian motions on
the spaceform $\mathbb{M}_{K,N}$ solves the stochastic differential equation defining $\rho_{K,N,a}$ with
$a = d(\mathbf{X}(0))$.

(ii) Since $-\rho_{K,N,a}$ has the same law as $\rho_{K,N,-a}$, the reflection map $x \mapsto -x$ on $(-R, R)$
provides a so-called 'reflection structure' in [19]. It is shown in [19] that the mirror
coupling for $\rho_{K,N,a}$ and $\rho_{K,N,-a}$ is maximal in such a case. As a result, we have

\[ \varphi_t^{K,N}(a) = \bigl\| P \circ (\rho_{K,N,a}(t)/2)^{-1} - P \circ (\rho_{K,N,-a}(t)/2)^{-1} \bigr\|_{TV}. \]

In particular, we can easily verify that $\varphi_t^{0,N}$ equals $\varphi_t$ in Theorem 1.1. Moreover,
$E[\varphi_{t-s}(|\rho(s)|)]$ is a constant function in $s \in [0, t]$ by [19, Lemma 3.4].

(iii) When $N \in \mathbb{N}$, the coupling $\mathbf{X}(t)$ by reflection of Brownian motions $B(t)$ on $\mathbb{M}_{K,N}$
is maximal by the same reasoning (see [19, Theorem 5.1 and Example 4.6]). Thus
we have

\[ \varphi_t^{K,N}(a) = \bigl\| P_{\tilde{x}_1} \circ B(t)^{-1} - P_{\tilde{x}_2} \circ B(t)^{-1} \bigr\|_{TV} \]

for any pair of points $(\tilde{x}_1, \tilde{x}_2)$ in $\mathbb{M}_{K,N}$ satisfying $d(\tilde{x}_1, \tilde{x}_2) = a$, and $E[\varphi_{t-s}(d(\mathbf{X}(s)))]$
is a constant function in $s \in [0, t]$. In particular, the right hand side of (1.3) equals
$\varphi_t^{K,N}(d(\tilde{x}_1, \tilde{x}_2))$.
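In the Euclidean case $K = 0$ the constancy statements reduce to the Markov property of the comparison process killed at $0$: $E[\varphi_t(\rho(s));\ \rho \text{ has not hit } 0 \text{ by time } s] = \varphi_{t+s}(a)$. The following sketch checks this identity by quadrature. It assumes the closed form $\varphi_t(a) = \operatorname{erf}(a/(4\sqrt{t}))$ and the reflection-principle density of $a + 2\sqrt{2}\,\beta(s)$ killed at $0$; the function names and the truncation of the integral are illustrative.

```python
import math


def phi(t: float, a: float) -> float:
    """Closed form of phi_t(a) for K = 0 and t > 0."""
    return math.erf(a / (4.0 * math.sqrt(t)))


def killed_density(s: float, a: float, r: float) -> float:
    """Density at r > 0 of a + 2 sqrt(2) beta(s), killed when it first hits 0.

    By the reflection principle it is g(r - a) - g(r + a), where g is the
    centered Gaussian density with variance 8 s.
    """
    var = 8.0 * s

    def g(z: float) -> float:
        return math.exp(-z * z / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

    return g(r - a) - g(r + a)


def expected_cost(t: float, s: float, a: float, n: int = 40000) -> float:
    """Midpoint rule for E[phi_t(rho(s)); rho has not hit 0 before time s]."""
    upper = a + 12.0 * math.sqrt(8.0 * s)  # Gaussian tail is negligible beyond
    h = upper / n
    return h * sum(
        phi(t, (k + 0.5) * h) * killed_density(s, a, (k + 0.5) * h)
        for k in range(n)
    )
```

With these definitions, `expected_cost(t, s, a)` and `phi(t + s, a)` should agree up to quadrature error for any $t, s, a > 0$.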

We are now ready to state our first main theorem:

Theorem 2.2 Suppose that Assumption 1 holds. Then, for any $x_1, x_2 \in M$, there exists
a coupling $\mathbf{X}(t) = (X_1(t), X_2(t))_{t \ge 0}$ of $\mathcal{L}$-diffusion processes starting from $(x_1, x_2)$ such
that, for any $t > 0$ and $s > 0$,

\[ E\bigl[ \varphi_t^{K,N}(d(\mathbf{X}(s))) \bigr] \le \varphi_{t+s}^{K,N}(d(x_1, x_2)). \]

Indeed, as we will see, the coupling $\mathbf{X}(t)$ appearing in Theorem 2.2 is given as the
coupling by reflection. Theorem 2.2 yields the following corresponding property described
in terms of $T_{\varphi_t(d)}$. This is our second main theorem:

Theorem 2.3 Suppose that Assumption 1 holds. For $i = 1,2$ and $\mu_0^{(i)} \in \mathcal{P}(M)$, let
$\mu_t^{(i)}$ be the distribution of $X(t)$ with the initial distribution $\mu_0^{(i)}$. Then, for any $t > 0$,
$T_{\varphi_{t-s}(d)}(\mu_s^{(1)}, \mu_s^{(2)})$ is a nonincreasing function of $s \in [0, t]$. That is, for $0 \le s_1 < s_2 \le t$,

\[ T_{\varphi_{t-s_2}(d)}\bigl(\mu_{s_2}^{(1)}, \mu_{s_2}^{(2)}\bigr) \le T_{\varphi_{t-s_1}(d)}\bigl(\mu_{s_1}^{(1)}, \mu_{s_1}^{(2)}\bigr). \tag{2.2} \]

As a result of Theorem 2.3, we can compare $T_{\varphi_{t-s}(d)}(\mu_s^{(1)}, \mu_s^{(2)})$ at $s = t$ with the one at
$s = 0$ to obtain an estimate of the total variation between distributions of the diffusion
process $X(t)$. In particular, when $\mu_0^{(1)}$ and $\mu_0^{(2)}$ are Dirac measures, we obtain the following
comparison theorem thanks to Remark 2.1 (ii):

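Corollary 2.4 Suppose that Assumption 1 holds. Then, for $x_1, x_2 \in M$ and $t > 0$,

\[ \bigl\| P_{x_1} \circ X(t)^{-1} - P_{x_2} \circ X(t)^{-1} \bigr\|_{TV}
\le \bigl\| P \circ (\rho_{K,N,d(x_1,x_2)}(t)/2)^{-1} - P \circ (\rho_{K,N,-d(x_1,x_2)}(t)/2)^{-1} \bigr\|_{TV}. \]

When $N \in \mathbb{N}$, it immediately implies Corollary 1.2 by virtue of Remark 2.1 (iii).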

Note that, by taking $t \to \infty$ in (2.2) after a suitable rescaling, we can obtain a similar
monotonicity formula whose cost is independent of $t$. See Corollary 5.3 below. In particular,
when $K < 0$, it does not seem to be known in the literature.

Remark 2.5 When $K > 0$, it is shown in [18] that under Assumption 1 the Bonnet-Myers
type diameter bound

\[ \operatorname{diam}(M) \le \pi \sqrt{\frac{N-1}{K}} \]

holds. Moreover, the equality holds only when $N = m$, $Z = 0$ and $M$ is isometric to the
$N$-dimensional sphere of constant sectional curvature $K/(N-1)$. In the case of equality,
the assertion in Theorem 2.2 is obvious by Remark 2.1 (iii) and hence we may assume
$\operatorname{diam}(M) < \pi \sqrt{(N-1)/K}$ in the sequel.
3 Proof of Theorem 2.2



We will show that the coupling by reflection studied in [20] (cf. [17]) satisfies the assertion
of Theorem 2.2 under Assumption 1. We begin with reviewing the construction of the
coupling by reflection. Let $(\xi_n)_{n \in \mathbb{N}}$ be independent random variables all of which are
uniformly distributed on the unit disk in $\mathbb{R}^m$. Let $(\gamma_{xy})_{x,y \in M}$ be a measurable family of
unit-speed minimal geodesics defined on $[0, d(x,y)]$ such that $\gamma_{xy}$ joins $x$ and $y$. Without
loss of generality, we may assume that $\gamma_{xy}$ is symmetric, that is, $\gamma_{xy}(d(x,y) - s) = \gamma_{yx}(s)$
holds. Let us define $\tilde{m}_{xy} : T_x M \to T_x M$ by

\[ \tilde{m}_{xy} v := v - 2 \langle v, \dot{\gamma}_{xy}(0) \rangle \dot{\gamma}_{xy}(0). \]

This is a reflection with respect to a hyperplane which is perpendicular to $\dot{\gamma}_{xy}(0)$. Let
$/\!/_\gamma$ be the parallel transport along a curve $\gamma$. Let us define $m_{xy} : T_x M \to T_y M$ by
$m_{xy} := /\!/_{\gamma_{xy}} \circ \tilde{m}_{xy}$. Clearly $m_{xy}$ is an isometry. Set $D(M) := \{(x,x) \mid x \in M\}$. Let
$\Phi : M \to \mathscr{O}(M)$ be a measurable section of the orthonormal frame bundle $\mathscr{O}(M)$ of $M$.
Let us define two measurable maps $\Phi_i : M \times M \to \mathscr{O}(M)$ for $i = 1,2$ by

\[ \Phi_1(x,y) := \Phi(x), \qquad
\Phi_2(x,y) := \begin{cases} m_{xy} \Phi_1(x,y), & (x,y) \in M \times M \setminus D(M), \\ \Phi(x), & (x,y) \in D(M). \end{cases} \]


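Take $x_1, x_2 \in M$. Let $t_n^\alpha := \alpha^2 n$ for $n \in \mathbb{N}_0$. By using $\Phi_i$, we define a coupled geodesic
random walk $\mathbf{X}^\alpha(t) = (X_1^\alpha(t), X_2^\alpha(t))$ with a scale parameter $\alpha$ by $X_i^\alpha(0) = x_i$ and, for
$t \in [t_n^\alpha, t_{n+1}^\alpha]$,

\[ \tilde{\xi}^i_{n+1} := \sqrt{2(m+2)}\, \Phi_i(\mathbf{X}^\alpha(t_n^\alpha))\, \xi_{n+1}, \]
\[ X_i^\alpha(t) := \exp_{X_i^\alpha(t_n^\alpha)} \left( \frac{t - t_n^\alpha}{\alpha^2} \Bigl( \alpha\, \tilde{\xi}^i_{n+1} + \alpha^2 Z(X_i^\alpha(t_n^\alpha)) \Bigr) \right) \]

for $i = 1,2$, where $\exp_x$ is the exponential map at $x$. Let us denote $C([0,\infty) \to M \times M)$
and $C([0,\infty) \to [-R, R])$ equipped with the topology of compact uniform convergence
by $\mathscr{C}$ and $\mathscr{C}_1$ respectively.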

In what follows, we assume Assumption 1. Then, by [17, Theorem 3.1] (also see
references therein), $X_i^\alpha(t)$ converges in law in $C([0,\infty) \to M)$ to an $\mathcal{L}$-diffusion process
starting from $x_i$ for $i = 1,2$ respectively. Thus $(\mathbf{X}^\alpha)_{\alpha > 0}$ is tight and hence a subsequential
limit $\mathbf{X}^{\alpha_k} \to \mathbf{X} = (X_1, X_2)$ in law in $\mathscr{C}$ exists. We fix such a subsequence $(\alpha_k)_{k \in \mathbb{N}}$. In
the rest of this paper, we use the same symbol $\mathbf{X}^\alpha$ for the subsequence $\mathbf{X}^{\alpha_k}$ and the term
"$\alpha \to 0$" always means the subsequential limit "$\alpha_k \to 0$". Let $\tau^*$ be the first hitting time
to $D(M)$ of $\mathbf{X}$. Then we define a coupling by reflection $\mathbf{X}^* = (X_1^*, X_2^*)$ by

\[ \mathbf{X}^*(t) := \begin{cases} \mathbf{X}(t) & \text{if } t < \tau^*, \\ (X_1(t), X_1(t)) & \text{if } t \ge \tau^*. \end{cases} \]

Since $\tau^*$ is a stopping time with respect to the filtration generated by $\mathbf{X}$, and $X_i$ ($i = 1,2$)
is a solution to the martingale problem associated with the same filtration, $\mathbf{X}^*$ is again a
coupling of $\mathcal{L}$-diffusion processes.






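Fix a reference point $o \in M$. For $R > 0$, let $\sigma_R : \mathscr{C}_1 \to [0, \infty]$ be given by $\sigma_R(w) :=
\inf\{t \in [0,\infty) \mid w(t) > R\}$. We define $\hat{\sigma}_R^i$ ($i = 1,2$) and $\hat{\sigma}_R$ by $\hat{\sigma}_R^i := \sigma_R(d(o, X_i^\alpha(\cdot)))$ and
$\hat{\sigma}_R := \hat{\sigma}_R^1 \wedge \hat{\sigma}_R^2$. Proposition 3.4 in [17] says that

\[ \lim_{R \to \infty} \limsup_{\alpha \to 0} P[\hat{\sigma}_R < \infty] = 0 \tag{3.1} \]

holds.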

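We next review a difference inequality for $d(\mathbf{X}^\alpha(t))$. To describe it, we introduce
some notation. For simplicity, let us denote $\gamma_{X_1^\alpha(t_n^\alpha) X_2^\alpha(t_n^\alpha)}$, $\tilde{m}_{X_1^\alpha(t_n^\alpha) X_2^\alpha(t_n^\alpha)}$ and
$d(\mathbf{X}^\alpha(t_n^\alpha))$ by $\gamma_n$, $\tilde{m}_n$ and $r^\alpha(n)$ respectively. Let $\xi^\perp_{n+1}(0)$ be the orthogonal projection of
$\tilde{\xi}^1_{n+1}$ to the hyperplane perpendicular to $\dot{\gamma}_n(0)$, that is, $2\xi^\perp_{n+1}(0) := (1 + \tilde{m}_n)\tilde{\xi}^1_{n+1}$. We
denote the vector field along $\gamma_n$ given by parallel transport of $\xi^\perp_{n+1}(0)$ by $(\xi^\perp_{n+1}(s))_{s \in [0, r^\alpha(n)]}$.
Let us define a weight function $h_{n+1} = h^{K,N}_{n+1}$ on $[0, r^\alpha(n)]$ and a vector field $V_{n+1} = V^{K,N}_{n+1}$
along $\gamma_n$ by

\[ h_{n+1}(s) := \begin{cases} c_{K/(N-1)}\!\left( s - \dfrac{r^\alpha(n)}{2} \right) c_{K/(N-1)}\!\left( \dfrac{r^\alpha(n)}{2} \right)^{-1} & \text{if } N < \infty, \\[1.5ex] 1 & \text{if } N = \infty, \end{cases} \]

\[ V_{n+1}(s) := h_{n+1}(s)\, \xi^\perp_{n+1}(s). \]

Recall that we are assuming $\operatorname{diam}(M) < \pi \sqrt{(N-1)/K}$ when $K > 0$ and $N < \infty$ (see
Remark 2.5). Hence $h_{n+1}$ is well-defined. For a smooth curve $\gamma$ and vector fields $V$ and
$W$ along $\gamma$, we denote the index form by $I_\gamma(V, W)$. When $V = W$, we use the symbol
$I_\gamma(V)$ for $I_\gamma(V, V)$. Take $v \in \mathbb{R}^m$. Let us define $\lambda_{n+1}$ and $\Lambda_{n+1}$ by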


'2v^2<e +1 (0), 7 n(0)) if X«(£) £ D(M) } 
2^j2~y/rn+~2(£ in+ i,v) otherwise, 



A n+ i := ( (Z(^), 7n (s))|^ ( n) + J 7n (K+i, 



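For $\delta > 0$, let us define $\tau_\delta : \mathscr{C}_1 \to [0, \infty]$ by $\tau_\delta(w) := \inf\{t \ge 0 \mid w(t) \le \delta\}$. We also
define $\hat{\tau}_\delta$ by $\hat{\tau}_\delta := \tau_\delta(d(\mathbf{X}^\alpha(\cdot)))$. In the sequel, we fix $\delta \in (0, 1)$ and $R > 1$. The first goal
is to prove the following difference inequality for $r^\alpha(n)$:

Proposition 3.1 For each $\varepsilon > 0$, there exists a family of events $(E^\alpha_\varepsilon)_\alpha$ with $\lim_{\alpha \to 0} P[E^\alpha_\varepsilon] = 1$
such that

\[ r^\alpha(n+1) \le r^\alpha(n) + \alpha \lambda_{n+1} + \alpha^2 \Psi(r^\alpha(n)) + \varepsilon \alpha^2 \]

holds for $n \in \mathbb{N}_0$ with $t_n^\alpha < \hat{\tau}_\delta \wedge \hat{\sigma}_R$ on $(E^\alpha_\varepsilon)^c$ for sufficiently small $\alpha$.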

We will prove this assertion by an argument similar to that in [17,20], so we only give a brief
sketch. It consists of the following three lemmata. The following is shown in
the same way as [20, Lemma 3] or [17, Lemma 4.4] by using the second variation formula
of arclength with a careful treatment of singularities arising from the cut locus.

Lemma 3.2 For $n \in \mathbb{N}_0$, we have

\[ r^\alpha(n+1) \le r^\alpha(n) + \alpha \lambda_{n+1} + \alpha^2 \Lambda_{n+1} + o(\alpha^2) \tag{3.2} \]

when $t_n^\alpha < \hat{\tau}_\delta \wedge \hat{\sigma}_R$ and $\alpha$ is sufficiently small. Moreover, we can control the error term
$o(\alpha^2)$ uniformly in the position of $\mathbf{X}^\alpha$.

Set $\mathcal{F}_n := \sigma(\xi_1, \ldots, \xi_n)$ and $\bar{\Lambda}_{n+1} := E[\Lambda_{n+1} \mid \mathcal{F}_n]$. For $\varepsilon > 0$ and $R > 0$, let us define an
event $E^\alpha_\varepsilon$ by

n 



sup ^T(A,-A,)< 



By following the arguments in [20, Lemma 6] or [17, Lemma 4.5], which are based on the Doob
submartingale inequality, we obtain the following.

Lemma 3.3 For any $\varepsilon > 0$ and $R > 0$, $P[E^\alpha_\varepsilon]$ tends to 1 as $\alpha \to 0$.

Lemma 3.3 allows us to replace $\Lambda_{n+1}$ in Lemma 3.2 with $\bar{\Lambda}_{n+1}$ up to small errors on $(E^\alpha_\varepsilon)^c$.
Thus, the proof of Proposition 3.1 will be completed with this choice of $E^\alpha_\varepsilon$ once we show the
following:

Lemma 3.4 $\bar{\Lambda}_{n+1} \le \Psi(r^\alpha(n))$.
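Proof. Note that we have

\[ \langle Z, \dot{\gamma}_n(s) \rangle \Big|_{s=0}^{r^\alpha(n)} = h_{n+1}(s)^2 \langle Z, \dot{\gamma}_n(s) \rangle \Big|_{s=0}^{r^\alpha(n)}
= \int_0^{r^\alpha(n)} \Bigl( h_{n+1}(s)^2 (\nabla Z)^\flat(\dot{\gamma}_n(s), \dot{\gamma}_n(s)) + 2 h'_{n+1}(s) h_{n+1}(s) \langle Z, \dot{\gamma}_n(s) \rangle \Bigr) ds. \tag{3.3} \]

By an easy computation, we obtain $E[\xi_1] = 0$ and $\mathrm{Cov}(\sqrt{2(m+2)}\, \xi_1) = 2\,\mathrm{Id}$. Thus we
have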

rr a (n) 

K(Vn + i)= / {(m-l)h' n+1 (s) 2 -m C (%(s),Us))h n+1 (s) 2 )ds. (3.4) 







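Combining (3.3) and (3.4) with the definition of $\Lambda_n$, we obtain

\[ \bar{\Lambda}_{n+1} = \left( \int_0^{r^\alpha(n)} \Bigl( h_{n+1}(s)^2 (\nabla Z)^\flat(\dot{\gamma}_n(s), \dot{\gamma}_n(s)) + 2 h'_{n+1}(s) h_{n+1}(s) \langle Z, \dot{\gamma}_n(s) \rangle
+ (m-1) h'_{n+1}(s)^2 - \mathrm{Ric}(\dot{\gamma}_n(s), \dot{\gamma}_n(s))\, h_{n+1}(s)^2 \Bigr) ds \right) 1_{\{\mathbf{X}^\alpha(t_n^\alpha) \notin D(M)\}}. \tag{3.5} \]

Thus, when $N = \infty$, the conclusion easily follows from Assumption 1. When $N = m$,
$Z = 0$ holds and Assumption 1 means $\mathrm{Ric} \ge K$. Thus an easy computation in (3.5) yields
the conclusion. When $m < N < \infty$, the arithmetic-geometric mean inequality implies

\[ 2 h'_{n+1}(s) h_{n+1}(s) \langle Z, \dot{\gamma}_n(s) \rangle \le (N - m) h'_{n+1}(s)^2 + \frac{1}{N - m} h_{n+1}(s)^2 \langle Z, \dot{\gamma}_n(s) \rangle^2
= (N - m) h'_{n+1}(s)^2 + \frac{1}{N - m} h_{n+1}(s)^2\, Z \otimes Z(\dot{\gamma}_n(s), \dot{\gamma}_n(s)). \tag{3.6} \]

By substituting (3.6) into (3.5), we obtain

\[ \bar{\Lambda}_{n+1} \le \left( \int_0^{r^\alpha(n)} \Bigl( (N-1) h'_{n+1}(s)^2
+ h_{n+1}(s)^2 \Bigl( \frac{1}{N - m} Z \otimes Z + (\nabla Z)^\flat - \mathrm{Ric} \Bigr)(\dot{\gamma}_n(s), \dot{\gamma}_n(s)) \Bigr) ds \right) 1_{\{\mathbf{X}^\alpha(t_n^\alpha) \notin D(M)\}}. \]

Hence Assumption 1 reduces the assertion to the same computation as in the case $N = m$.
□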
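The "easy computation" for the case $N = m$ can be spelled out as follows. This is a sketch that assumes the weight is $h(s) = c_\kappa(s - r/2)/c_\kappa(r/2)$ with $\kappa = K/(N-1)$ and $r = r^\alpha(n)$; the shorthand $h$, $\kappa$, $r$ is introduced only for this derivation. It uses $h'' = -\kappa h$, $h(0) = h(r) = 1$ and $c_\kappa' = -\kappa s_\kappa$, and it is stated on the event $\{\mathbf{X}^\alpha(t_n^\alpha) \notin D(M)\}$.

```latex
% Sketch for N = m < \infty, Z = 0, Ric >= K, on the event X^\alpha(t_n^\alpha) \notin D(M).
% Assumption of this sketch: h(s) = c_\kappa(s - r/2) / c_\kappa(r/2), \kappa = K/(N-1).
\begin{align*}
\bar{\Lambda}_{n+1}
  &\le \int_0^r \bigl( (N-1)\, h'(s)^2 - K\, h(s)^2 \bigr)\, ds
   = (N-1) \int_0^r \bigl( h'(s)^2 - \kappa\, h(s)^2 \bigr)\, ds \\
  &= (N-1) \Bigl( \bigl[ h\, h' \bigr]_0^r - \int_0^r h(s) \bigl( h''(s) + \kappa\, h(s) \bigr)\, ds \Bigr)
   = (N-1) \bigl( h'(r) - h'(0) \bigr) \\
  &= (N-1) \bigl( -\kappa\, t_\kappa(r/2) - \kappa\, t_\kappa(r/2) \bigr)
   = -2K\, t_{K/(N-1)}(r/2) = \Psi(r).
\end{align*}
```

On the complementary event both sides vanish, since then $r = 0$ and $\Psi(0) = 0$.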

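Set $a := d(x_1, x_2)$. Let $\rho^\alpha_{K,N,a}(t)$ be a discrete approximation of $\rho_{K,N,a}(t)$ defined
inductively by $\rho^\alpha_{K,N,a}(0) = a$ and, for $t \in [t_n^\alpha, t_{n+1}^\alpha]$,

\[ \rho^\alpha_{K,N,a}(t) := \rho^\alpha_{K,N,a}(t_n^\alpha) + \frac{t - t_n^\alpha}{\alpha^2} \Bigl( \alpha \lambda_{n+1} + \alpha^2 \Psi\bigl(\rho^\alpha_{K,N,a}(t_n^\alpha)\bigr) \Bigr). \]

For $R > 0$, let us define $R^* > 1$ by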

. R - 4 if K > and A r < oo, 
R := { R 

R otherwise. 

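The following comparison theorem is crucial for the proof of Theorem 2.2.

Proposition 3.5 For $T > 0$, $R > 0$ and $\varepsilon > 0$, there exists a constant $C(\varepsilon, T) > 0$
satisfying $\lim_{\varepsilon \to 0} C(\varepsilon, T) = 0$ such that

\[ d(\mathbf{X}^\alpha(t)) \le \rho^\alpha_{K,N,a}(t) + C(\varepsilon, T) \]

holds for $t < \hat{\tau}_\delta \wedge \hat{\sigma}_R \wedge \sigma_{R^*}(\rho^\alpha_{K,N,a}) \wedge T$ on $(E^\alpha_\varepsilon)^c$ for sufficiently small $\alpha$.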

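Proof. By [17, Corollary 3.6(i)], it suffices to show the assertion only when $t = t_n^\alpha$ for
some $n \in \mathbb{N}_0$ (cf. [17, Lemma 3.10]). For simplicity of notation, we denote $\rho^\alpha_{K,N,a}(t_n^\alpha)$
by $\rho^\alpha(n)$. Applying Proposition 3.1, we obtain

\[ r^\alpha(n+1) - \rho^\alpha(n+1) \le r^\alpha(n) - \rho^\alpha(n) + \alpha^2 \bigl( \Psi(r^\alpha(n)) - \Psi(\rho^\alpha(n)) \bigr) + \varepsilon \alpha^2 \tag{3.7} \]

for $n \in \mathbb{N}_0$ with $t_n^\alpha < \hat{\tau}_\delta \wedge \hat{\sigma}_R \wedge \sigma_{R^*}(\rho^\alpha_{K,N,a})$ on $(E^\alpha_\varepsilon)^c$. Under our assumption on $n$,
$r^\alpha(n) \in [\delta, R]$ and $\rho^\alpha(n) \in [0, R^*]$ hold. Note that $\Psi$ is bounded on $[0, \operatorname{diam}(M) \wedge R^*]$.
Let $f_\alpha : \mathbb{R} \to \mathbb{R}$ be a function of class $C^2$ satisfying the following conditions: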

(i) /«( a; ) — for x < and / a (x) = x + a/2 for x > a, 

(ii) f a is convex, 

(iii) limsupa 2 sup /"(«) < C for some C > 

a->0 «gM 
(cf. the proof of [17, Lemma 3.10]). By (3.7), the Taylor expansion together with the
condition (iii) of $f_\alpha$ yields

f a (r a (n + l)-p°(n+ 1)) < f a (r a (n) - p a (n)) 

+ a 2 f' a (r a (n) - p») (*(r») - #(p»)) + 2ea 2 (3.8) 

for $\alpha$ sufficiently small relative to $\varepsilon$. Since $\Psi$ is nonincreasing, properties (i) and (ii) of $f_\alpha$
imply

f' a (r a (n) - p») (¥(r») - *(p»)) < 0. 
Thus, an iteration of (3.8) together with the fact f a (x) + a/2 > x V yield 

(r» - p») + < f a (r a (n) - p») + | < 2ea 2 n + e 

for $\alpha < 2\varepsilon$. Since $t_n^\alpha = \alpha^2 n \le T$, the conclusion follows. □

Now we are in a position to give a crucial step of the proof of Theorem 2.2.

Proposition 3.6 For any nondecreasing continuous function $\psi : [0, R) \to [0, 1]$ with
$\psi(0) = 0$, we have

\[ E[\psi(d(\mathbf{X}^*(s)))] \le E[\psi(\rho(s)) \,;\, \tau_0(\rho) > s]. \]

Proof. Take $\delta > 0$, $R > 1$ and $t > s$. Let $\varepsilon > 0$ be so small that $C(\varepsilon, t) < \delta/2$. By
virtue of Proposition 3.1, for sufficiently small $\alpha$,

E[i;(d(X. a (s)))] < E [iP(d(X. a (s))) ; {fa > s} H {a R > s} fl (E°) c ] 

+ n°R <s]+E [^(rf(X Q (s))) ; f 5 < s] + e. (3.9) 

By Proposition 3.5 and the choice of e, 

E [i;(d(X. a (s))) ; {fa > s} n {<7 B > s} n (£ e Q ) c ] 

<E^(p Q ( S ) + C , (e,t)); r 5/2 (p Q )A^(p Q )>s]+P[a R *(p a )<s]. (3.10) 

Let us define * : [0, oo) ->• R by 

*(«) : = (*(«) a mmii) v (-i*(-(2flni). 

We define p a and p by replacing \I/ with \1/ in the definition of p a and p respectively. Since 
ijr(it) = \&(u) for u G [0, i?*], we obtain 

E[ii(p a (s) +C(s,t)) ; r 5/2 (p a ) A % (/, ft ) > s] 

< E [^(p a (s) + C(e,t)) ; r 5/2 (p a ) > s] , (3.11) 
P[^(p°) < s] = F[a R ,(p a ) < s}. (3.12) 

Since ^ is bounded and continuous, we can easily show that p a converges in law to p in 
C([0, oo) — >■ R). Note that the following holds: 

{w G # ; r 5 / 2 (d(w)) > s} C {w G # ; r s/4 (d(w)) > s} . 
By combining this fact with (3.11), the Portmanteau theorem together with (3.10), (3.11)
and (3.12) yields 

limsupE [^(X a ( S ))) ; {r 5 > s} n {a R > s} n (E«) c ] 

< E fy(p(s) + C(e,t)) ; r 5/4 (p) > s] +F[a Rr {p) < s}. (3.13) 
In a similar way as (3.11) and (3.12), we obtain 

E y,{p(s) + C(e,t)) ; r s/4 (p) >s}+ F[a R *(p) < s] 

< E [^{p{s) + C{e,t)) ; r 5/4 (p) > s] + 2P[<r fl .(p) < s]. (3.14) 

Here we used the fact ip < 1. Since X Q converges in law to X in by applying the 
Portmanteau theorem to (3.9) together with (3.13) and (3.14), we obtain 

E[^(d(X( S )))] = limE[^(rf(X a ( S )))] 

a— >0 

< E fy(p( s ) + C(e,t)) ; r 5/4 (p) > s] + 2P[^,(p) < s] 

+ limsupP[<r i? < s] +E[<0(cZ(X(s))) ; r«(d(X(-))) < s] + e. 

a-s>0 

By letting £ — >■ in this inequality, we obtain 

E[#J(X(a))); r f (rf(X(-)))> S ] 

< E [i/>(p(s)) ; T 5/A {p) >s}+ 2F[a R *(p) <s} + limsupPfo < s}. (3.15) 

a->0 

By the definition of X* and r*, we have 

limE[^(d(X(a))) ; r 5 (d(X(-))) > s] = limE[^(d(X*(s))) ; r s {d{X* (■))) > s] 

<5->0 5 — >-0 

= E[^(X*(s))); r*>s] 

= E[#i(X*(s)))]. (3.16) 

Here the last equality follows from V'(O) = 0. Similarly we obtain 

limE [ip(p(s)) ; r s/4 (p) > s] = E ty>(p(s)) ; r (p) > s] . (3.17) 

Thus, by combining (3.15) with (3.16) and (3.17) and by tending R — > oo with (3.1) in 
mind, we obtain 

E^KX*( S )))]<E[^(p(s)); r (p)>s]. 
Here we used the fact that p cannot hit R in finite time. Hence the assertion holds. □ 

To complete the proof of Theorem 2.2, we will use a regularity result on $\varphi_t$ from the next
section. As we will see, all the arguments in the next section are independent of this
section. Thus there is no danger of circular arguments.
Proof of Theorem 2.2. By virtue of Proposition 4.5 (ii) below, we can apply Proposition 3.6
with $\psi = \varphi_t$. Thus we obtain

\[ E[\varphi_t(d(\mathbf{X}^*(s)))] \le E[\varphi_t(\rho(s)) \,;\, \tau_0(\rho) > s]. \tag{3.18} \]

Since $-\rho_{K,N,a} = \rho_{K,N,-a}$ holds in law, a process $\rho^* = (\rho^{(1)}, \rho^{(2)})$ given by

\[ \rho^*(t) := \begin{cases}
\left( \dfrac{\rho_{K,N,a}(t)}{2},\ -\dfrac{\rho_{K,N,a}(t)}{2} \right) & \text{if } t < \tau_0(\rho_{K,N,a}), \\[2ex]
\left( \dfrac{\rho_{K,N,a}(t)}{2},\ \dfrac{\rho_{K,N,a}(t)}{2} \right) & \text{if } t \ge \tau_0(\rho_{K,N,a})
\end{cases} \]

is a coupling of $\rho_{K,N,a}/2$ and $\rho_{K,N,-a}/2$. Since the reflection map $x \mapsto -x$ on $(-R/2, R/2)$
provides a reflection structure for $\rho_{K,N,a}/2$ in the sense of [19], $\rho^*$ is a maximal coupling
of $\rho_{K,N,a}/2$ and $\rho_{K,N,-a}/2$, and $\tau_0(\rho_{K,N,a}/2) = \tau_0(\rho_{K,N,a})$ is the coupling time of $\rho^*$. Thus
Remark 2.1 (ii) yields

\[ E[\varphi_t(|\rho^*(s)|)] = \varphi_{t+s}(a). \tag{3.19} \]

Since the definition of $\rho^*$ implies

\[ E[\varphi_t(\rho(s)) \,;\, \tau_0(\rho) > s] = E[\varphi_t(|\rho^*(s)|)], \]

combining this with (3.18) and (3.19) yields the conclusion. □

4 Properties of the cost function

Let us define $\chi : [0, \infty] \to [0, 1]$ by

\[ \chi(r) := \frac{1}{\sqrt{2\pi}} \int_{-r}^{r} e^{-u^2/2}\, du \]

and $\chi(\infty) = 1$. We can easily verify that $\chi$ is increasing and concave. The first goal of
this section is the following expression of $\varphi_t(a)$:

Proposition 4.1 For each $N \in [2, \infty]$, $K \in \mathbb{R}$ and $t > 0$, there exists a probability
measure $\zeta_{t,K,N}$ on $[0, \infty)$ such that

\[ \varphi_t(a) = \int_{[0,\infty)} \chi\!\left( \frac{a}{2\sqrt{2u}} \right) \zeta_{t,K,N}(du) \tag{4.1} \]

holds for each $a \in [0, \infty)$. In addition, we can take $\zeta_{t,K,N}$ so that it is continuous in $t$ with
respect to the topology of weak convergence.

The expression (4.1) will be used to study some properties of $\varphi_t(a)$ in Proposition 4.5.
We divide the proof of Proposition 4.1 into the following two lemmata: Lemma 4.2 when
$N = \infty$ or $K = 0$ and Lemma 4.4 when $N < \infty$ and $K \ne 0$. We will give an expression
of $\zeta_{t,K,N}$ there.
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Lemma 4.2 Suppose N = oo or K = 0. Then 

<Pt(a) = X 

where rj(t) = r] K (t) is given by 



a 



{ e 2Kt _ 1 
t K = 0. 

In particular, Proposition 4-1 holds with ( t ,K, oo = C*,o,jv — ^(t)- 

Proof. In this case, $\rho_t = e^{-Kt} a + 2\sqrt{2} \int_0^t e^{K(s-t)}\,d\beta_s$ holds. By the martingale representation theorem, $\int_0^t e^{Ks}\,d\beta_s$ can be written as a deterministic time-change of a standard one-dimensional Brownian motion. By using this fact together with the expression of the hitting time distribution of the Brownian motion, the desired expression of $\varphi_t$ follows. □
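
The two steps of this proof can be spelled out as in the following sketch. It assumes that $\chi(r) = \mathbb{P}[|G| \le r]$ for a standard Gaussian variable $G$, which is the normalization forced by the formula for $\varphi_t$ in Lemma 4.2; the symbols $W$ and $G$ are introduced here only for illustration.

```latex
% Step 1 (time change): the stochastic integral is a continuous Gaussian
% martingale with deterministic quadratic variation \eta(t), so for some
% standard Brownian motion W,
\[
  \int_0^t e^{Ks}\,d\beta_s = W(\eta(t)), \qquad
  \eta(t) = \int_0^t e^{2Ks}\,ds .
\]
% Step 2 (reflection principle): since e^{-Kt} > 0, the process \rho stays
% positive on [0,t] iff a + 2\sqrt{2}\,W stays positive on [0,\eta(t)].
\[
  \varphi_t(a)
  = \mathbb{P}\Big[ \inf_{0 \le u \le \eta(t)} \big( a + 2\sqrt{2}\,W(u) \big) > 0 \Big]
  = \mathbb{P}\Big[ |W(\eta(t))| < \frac{a}{2\sqrt{2}} \Big]
  = \mathbb{P}\Big[ |G| < \frac{a}{2\sqrt{2\eta(t)}} \Big].
\]
```

Evaluating the integral in Step 1 gives $\eta(t) = (e^{2Kt}-1)/(2K)$ for $K \neq 0$ and $\eta(t) = t$ for $K = 0$.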

To consider the case $N < \infty$, we begin with the following auxiliary lemma:

Lemma 4.3 Suppose $N < \infty$. Let $\beta^i(t)$ be the standard one-dimensional Brownian motion for $i = 1,2$. Let $\theta(t)$ be the solution to the following stochastic differential equation:

d9(t) = V2dp\t) + ( tA7( ^~ ( g (t)) - j^zjtK/( N -i)(d(t)yJ dt, 
6{0) = 0. 
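Let us define $\Xi(t)$ by
\[ \Xi(t) := 2\, s_{K/(N-1)}^{-1}\!\left( c_{K/(N-1)}(\theta(t)) \left( s_{K/(N-1)}\!\left( \frac{a}{2} \right) + \sqrt{2} \int_0^t \frac{d\beta^2(s)}{c_{K/(N-1)}(\theta(s))} \right) \right). \tag{4.2} \]
Then $\Xi$ has the same law as $\rho$.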

The alternative expression of $\rho$ in the last lemma comes from a skew-product expression of the distance between two Brownian motions coupled by reflection on a sphere. To explain the heuristic idea behind it, we assume $N \in \mathbb{N}$, $K = N - 1$ and $Z = 0$ for a while. We identify the sphere $\mathbb{S}^N$ of constant sectional curvature 1 with a unit sphere in $\mathbb{R}^{N+1}$ as a submanifold. Let $H$ be a (uniquely determined) 2-dimensional plane in $\mathbb{R}^{N+1}$ containing the origin and the given starting points of the coupling of Brownian motions by reflection. Then we can decompose the Brownian motion on $\mathbb{S}^N$ into the "circular part", that is, the projection to $H$, and the "complementary part", that is, the projection to $H^\perp$. As a result, we can describe the distance between the two Brownian particles coupled by reflection by the scaled distance between two time-changed Brownian motions coupled by reflection on a circle whose space scaling and clock process are given by functionals of the complementary part. This description leads us to the expression in Lemma 4.3. Moreover, once we obtain this expression, we can verify that it is valid even when $N \notin \mathbb{N}$ or $K < 0$, as we will see in the following proof of Lemma 4.3. For the skew product expression of spherical Brownian motions, see [14,29], for example.
Proof of Lemma 4.3. For simplicity of notation, we denote $K/(N-1)$ by $\tilde K$ in this proof. Let $\tilde\Xi(t) := s_{\tilde K}(\Xi(t)/2)$ and $\tilde\rho(t) := s_{\tilde K}(\rho(t)/2)$. It suffices to show that both $\tilde\Xi(t)$ and $\tilde\rho(t)$ solve the following stochastic differential equation



\[ dz(t) = \sqrt{2\left( 1 - \tilde K z(t)^2 \right)}\, dw(t) - N \tilde K z(t)\,dt \tag{4.3} \]

for a standard Brownian motion $w(t)$. By the Itô formula together with (2.1), we can easily verify that $\tilde\rho$ solves (4.3) with $w(t) = \beta(t)$. The Itô formula together with (4.2) yields

^ = -^*W«»«(5 + ^jf5S))*W 



dfi 2 (t) 



\2 J c K {9{s)) / 

Here we used the relation $c_{\tilde K}(r)^2 + \tilde K s_{\tilde K}(r)^2 = 1$, which holds for any $\tilde K \in \mathbb{R}$, to obtain the last equality. By a direct computation, we have

Note that $1 - \tilde K \tilde\Xi(t)^2 > 0$ holds for any $t \ge 0$ almost surely since $\theta(t)$ never hits $R/2$.
Thus, $\tilde\Xi$ solves (4.3) with $w(t) = \beta^*(t)$ given by



K~(t) 2 



and hence the conclusion follows. □ 

Lemma 4.4 Suppose $N < \infty$. We denote the law of $\int_0^t c_{K/(N-1)}(\theta(s))^{-2}\,ds$ by $\zeta_{t,K,N}$ for each $t > 0$, where $\theta(\cdot)$ is as in Lemma 4.3. Then the conclusion of Proposition 4.1 holds true.

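Proof. The continuity in $t$ of $\zeta_{t,K,N}$ directly follows from the definition. Let $\Xi$ be as in Lemma 4.3. By the martingale representation theorem, there exists a standard one-dimensional Brownian motion $B(t)$ such that
\[ \int_0^t \frac{d\beta^2(s)}{c_{K/(N-1)}(\theta(s))} = B\!\left( \int_0^t \frac{ds}{c_{K/(N-1)}(\theta(s))^2} \right) \]
holds. Since $\beta^1$ and $\beta^2$ are independent, $B(\cdot)$ behaves as a standard Brownian motion even under the conditional probability measure $\mathbb{P}[\,\cdot \mid \sigma(\beta^1)]$. Thus the definition of $\varphi_t(a)$ and Lemma 4.3 yield
\begin{align*}
\varphi_t(a)
&= \mathbb{P}\left[ \inf_{0 \le s \le t} \rho(s) > 0 \right] \\
&= \mathbb{P}\left[ \inf_{0 \le s \le t} \left( s_{K/(N-1)}\!\left( \frac{a}{2} \right) + \sqrt{2} \int_0^s \frac{d\beta^2(u)}{c_{K/(N-1)}(\theta(u))} \right) > 0 \right] \\
&= \mathbb{E}\left[ \mathbb{P}\left[ \inf_{0 \le s \le \int_0^t c_{K/(N-1)}(\theta(u))^{-2}\,du} \left( s_{K/(N-1)}\!\left( \frac{a}{2} \right) + \sqrt{2}\, B(s) \right) > 0 \,\middle|\, \sigma(\beta^1) \right] \right] \\
&= \mathbb{E}\left[ \chi\!\left( \frac{1}{\sqrt{2}}\, s_{K/(N-1)}\!\left( \frac{a}{2} \right) \left( \int_0^t \frac{du}{c_{K/(N-1)}(\theta(u))^2} \right)^{-1/2} \right) \right].
\end{align*}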

Hence the desired result holds. □

Now we state some consequences of the expressions of $\varphi_t(a)$ in Proposition 4.1:

Proposition 4.5 (i) For $a \in [0,R]$, $[0,\infty) \ni t \mapsto \varphi_t(a)$ is continuous.

(ii) $\varphi_t$ is continuous on $[0,R)$ and smooth on $(0,R)$ for $t > 0$.

(iii) $\varphi_t$ is concave on $[0,R)$ for $t \ge 0$.

(iv) For $t > 0$, $K, K' \in \mathbb{R}$ with $K \ge K'$, $N, N' \in [2,\infty]$ with $N \le N'$ and $a \in [0, R_{K,N}]$,
\[ \varphi_t^{K,N}(a) \le \varphi_t^{K',N'}(a). \]

(v) Fort > 0, K G M and N G [2, oo], <fif' N is differentiable at 0. Moreover, for K' El 
and N' E [2, oo] with K' < K, N' > N, 



(f?> N )'(0) 



[0,oo) 



Ct,K,N{du) 
Ax/mi 



< iff' N ')'(o). 



(4.4) 



In particular, (<^f' )'(0) < {(f K ^)\0) = {^(t))- 1 / 2 / A. Here r)(t) = r) K {t) is as in 
Lemma 4-2. 

(vi) ForK ER and N E [2, oo], lim y/i(<p? ,7V )'(0) = — =. 

tiO 4a/7T 

Proof. (i) It is obvious by the continuity of $\zeta_{t,K,N}$ in $t$ and the expression (4.1).

(ii) Note that the derivative of any order in the $a$-variable of the integrand in (4.1) is a bounded function of $u$ for $a \in (0,R)$. Thus the dominated convergence theorem yields that $\varphi_t$ is smooth on $(0,R)$. We can show the continuity of $\varphi_t$ on $[0,R)$ similarly.

(iii) Since $\varphi_0 = 1_{(0,\infty)}$ by definition, it is obviously concave. Thus it suffices to consider the case $t > 0$. As we did in the proof of (ii), we can compute $\varphi_t''(a)$ at $a \in (0,R)$ by using the dominated convergence theorem. Since $\chi$ is concave, $\varphi_t''(a) \le 0$ and hence the conclusion holds because $\varphi_t$ is continuous on $[0,R)$ by (ii).

(iv) By a direct computation, we can verify $\Psi_{K,N}(u) \le \Psi_{K',N'}(u)$ for any $u \in [0, R_{K,N})$. Thus the comparison theorem for stochastic differential equations (see [13] for instance) yields that $\rho_{K,N,a}(t) \le \rho_{K',N',a'}(t)$ for $a' > a$ and $t \ge 0$. It implies $\varphi_t^{K,N}(a) \le \varphi_t^{K',N'}(a')$ by the definition of $\varphi_t^{K,N}$. Since $\varphi_t$ is continuous, the asserted inequality follows by tending $a' \downarrow a$.

(v) Since $\chi$ is concave and $\chi(0) = 0$, $\chi(r)/r$ is nonincreasing. Thus the monotone convergence theorem yields



lim 

<40 



K,N / \ K,N 
ft ( a ) ~ ft 



(0) 



lim 



K.N i \ 

ft W 



[0,oo) 



(t,K,N{du) 

4:\flXU 



By combining this identity with (iv), we obtain 



(t,K,N(du) 



[0,oo) 



< 



t,K\N> 



(du) 



[0,oo) 



4:\/ttu 



< 



,t,K\oo 



(du) 



[0,oo) 



< oo 



and hence the conclusion follows. 

(vi) We use the expression of $(\varphi_t^{K,N})'(0)$ in (v). When $N = \infty$, it easily follows from Lemma 4.2. Next we consider the case $K > 0$ with the expression of $\zeta_{t,K,N}$ given in Lemma 4.4. By the definition of $\theta(t)$ in Lemma 4.3, $c_{K/(N-1)}(\theta(t)) \in (0,1]$ holds for each $t \ge 0$. Thus we have $\int_0^t c_{K/(N-1)}(\theta(s))^{-2}\,ds \ge t$ and therefore the dominated convergence theorem yields



lim y/t(ipj 
40 



K,N\r 



(0) = lim 



1 



40 40F 



E 



ds 



t Jo ^/(^.^(^(s)) 5 



-1/2- 



40^' 



Finally, for general $K \in \mathbb{R}$ and $N \in [2,\infty)$, the conclusion follows from (4.4) together with the above-mentioned two cases. □



Since $\varphi_t(0) = 0$, Proposition 4.5 (iii) yields the following corollary:

Corollary 4.6 We have $\varphi_r(a + a') \le \varphi_r(a) + \varphi_r(a')$ for $r \ge 0$ and $a, a' \ge 0$. In particular, for $t > 0$, $\varphi_t(d(\cdot,\cdot))$ is a bounded distance function which is compatible with the topology on $M$.
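
The subadditivity in Corollary 4.6 can be checked numerically in the explicit case of Lemma 4.2. The following is a minimal sketch, not the authors' code: it assumes $N = \infty$ or $K = 0$, and it assumes the normalization $\chi(r) = \mathbb{P}[|G| \le r]$ for a standard Gaussian $G$; the function names `eta`, `chi`, `phi` and `max_subadditivity_defect` are hypothetical.

```python
import math


def eta(t: float, K: float) -> float:
    """Clock eta_K(t) of Lemma 4.2: (e^{2Kt} - 1) / (2K), or t when K = 0."""
    return t if K == 0.0 else math.expm1(2.0 * K * t) / (2.0 * K)


def chi(r: float) -> float:
    """Assumed normalization: chi(r) = P[|G| <= r] for a standard Gaussian G."""
    return math.erf(r / math.sqrt(2.0))


def phi(t: float, a: float, K: float = 0.0) -> float:
    """phi_t(a) in the case N = infinity or K = 0 (formula of Lemma 4.2)."""
    if a <= 0.0:
        return 0.0
    if t == 0.0:
        return 1.0  # phi_0 is the indicator function of (0, infinity)
    return chi(a / (2.0 * math.sqrt(2.0 * eta(t, K))))


def max_subadditivity_defect(t: float, K: float, grid) -> float:
    """Largest value of phi(a + b) - phi(a) - phi(b) over the grid.

    Subadditivity means this quantity is never positive (up to rounding).
    """
    return max(
        phi(t, a + b, K) - phi(t, a, K) - phi(t, b, K)
        for a in grid
        for b in grid
    )


if __name__ == "__main__":
    grid = [0.1 * i for i in range(41)]
    for K in (-1.0, 0.0, 1.0):
        print(K, max_subadditivity_defect(1.0, K, grid))
```

The defect is taken over a finite grid only, so this illustrates the inequality rather than proving it; the proof is the concavity argument of Proposition 4.5 (iii).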



Though the preparation of the proof of Theorem 2.3 is already finished in Proposition 4.5, we will discuss further properties of $\varphi_t$ in the rest of this section. First we will study a more explicit expression of $\varphi_t(a)$ than the one in Lemma 4.4 in the case $N < \infty$ and $K \neq 0$. Lemma 4.7 and Corollary 4.8 below study the case $K < 0$. Based on the expression of the Brownian motion on the hyperbolic space by a stochastic differential equation (see [27], for instance), we can show the following in a similar way as Lemma 4.3:
Lemma 4.7 Suppose $N < \infty$ and $K < 0$. Let $\beta^1(t)$ and $\beta^2(t)$ be independent one-dimensional standard Brownian motions. Let $\Xi'(t)$ and $\theta'(t)$ be given by
\[ \theta'(t) := \exp\!\left( \sqrt{\frac{-2K}{N-1}}\, \beta^1(t) + Kt \right), \]
\[ \Xi'(t) := s_{K/(N-1)}\!\left( \frac{a}{2} \right) + \sqrt{2} \int_0^t \theta'(s)\,d\beta^2(s). \]
Then $2\, s_{K/(N-1)}^{-1}(\Xi'(\cdot)/\theta'(\cdot))$ has the same law as $\rho$.

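Proof. As in the proof of Lemma 4.3, we denote $K/(N-1)$ by $\tilde K$. We already know from the proof of Lemma 4.3 that $s_{\tilde K}(\rho(t)/2)$ solves the stochastic differential equation (4.3) with $w(t) = \beta(t)$. Thus it suffices to show that $\Xi'(t)/\theta'(t)$ also solves (4.3) for a standard Brownian motion $w(t)$. The Itô formula yields
\begin{align*}
d\!\left( \frac{\Xi'(t)}{\theta'(t)} \right)
&= -\sqrt{-\tilde K}\, \frac{\Xi'(t)}{\theta'(t)} \left( \sqrt{2}\,d\beta^1(t) - (N-2)\sqrt{-\tilde K}\,dt \right) - 2\tilde K \frac{\Xi'(t)}{\theta'(t)}\,dt + \sqrt{2}\,d\beta^2(t) \\
&= \sqrt{2}\,d\beta^2(t) - \sqrt{-2\tilde K}\, \frac{\Xi'(t)}{\theta'(t)}\,d\beta^1(t) - N\tilde K \frac{\Xi'(t)}{\theta'(t)}\,dt.
\end{align*}
Thus $\Xi'(t)/\theta'(t)$ solves (4.3) with $w(t) = \beta^{**}(t)$ given by
\[ \beta^{**}(t) := \int_0^t \frac{d\beta^2(s) - \sqrt{-\tilde K}\, (\Xi'(s)/\theta'(s))\,d\beta^1(s)}{\sqrt{1 - \tilde K\, (\Xi'(s)/\theta'(s))^2}}. \]
□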



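Corollary 4.8 Suppose $N < \infty$ and $K < 0$. Then
\[ \varphi_t(a) = \mathbb{E}\left[ \chi\!\left( \frac{1}{\sqrt{2}}\, s_{K/(N-1)}\!\left( \frac{a}{2} \right) \left( \int_0^t \theta'(s)^2\,ds \right)^{-1/2} \right) \right], \tag{4.5} \]
where $\theta'(t)$ is as in Lemma 4.7. Moreover,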



<PtW 



oo roc 



oo JO 



*' \V {N K l)u K/{N - 1) \2 



'(N-l), Tjr , l + e 2x \ a fe x -2Kt\ du , 
x exp ( ^ — - — '-{Kt -x) = I??!—, — r I — dx. 



2u 



u N - 1 J u 



where 



e -? 2 /(2*) e -rcosh(€) 



) sin ( ^- ) 
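Proof. Let $\Xi'(t)$ and $\theta'(t)$ be as in Lemma 4.7. By the martingale representation theorem, there exists a Brownian motion $B(t)$ such that
\[ \Xi'(t) = s_{K/(N-1)}\!\left( \frac{a}{2} \right) + B\!\left( 2 \int_0^t \theta'(s)^2\,ds \right). \]
Hence, as in the proof of Lemma 4.4, the definition of $\varphi_t(a)$ and Lemma 4.7 yield
\begin{align*}
\varphi_t(a)
&= \mathbb{P}\left[ \inf_{0 \le s \le t} \rho(s) > 0 \right]
= \mathbb{P}\left[ \inf_{0 \le s \le t} \Xi'(s) > 0 \right] \\
&= \mathbb{P}\left[ \inf_{0 \le s \le 2\int_0^t \theta'(u)^2\,du} \left( s_{K/(N-1)}\!\left( \frac{a}{2} \right) + B(s) \right) > 0 \right] \\
&= \mathbb{E}\left[ \chi\!\left( \frac{1}{\sqrt{2}}\, s_{K/(N-1)}\!\left( \frac{a}{2} \right) \left( \int_0^t \theta'(u)^2\,du \right)^{-1/2} \right) \right].
\end{align*}
This is nothing but (4.5). Now the conclusion follows by using an explicit expression of the distribution of $\int_0^t \theta'(u)^2\,du$ in [26, Theorem 4.1] (also see references therein). □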

In the case $K > 0$, we use several properties of the Gegenbauer, or ultraspherical, polynomials to obtain an alternative expression of $\varphi_t$ in Lemma 4.9 below. We refer to [33] for basics on Gegenbauer polynomials.

Lemma 4.9 Suppose $N < \infty$ and $K > 0$. Then, for all $a \in [0,R]$,
\[ \varphi_t(a) = \sum_{n=0}^{\infty} e^{-(2n+1)(2n+N)Kt/(N-1)}\, \frac{(-1)^n (4n+N+1)}{\pi (2n+N)}\, B\!\left( \frac{N-1}{2},\, n + \frac{1}{2} \right) P_{2n+1}(\tilde a), \]
where $B(\cdot,\cdot)$ is the Beta function, $\tilde a := \sin\left( \sqrt{K/(N-1)}\, a/2 \right)$ and $P_n(x)$ is the $n$-th Gegenbauer polynomial of parameter $(N-1)/2$.

Proof. Let us define $p(t)$ by
\[ p(t) := \sin\!\left( \frac{1}{2} \sqrt{\frac{K}{N-1}}\; \rho\!\left( \frac{(N-1)t}{2K} \right) \right). \]
Then $p(t)$ solves the following stochastic differential equation:
\[ dp(t) = \sqrt{1 - p(t)^2}\,d\beta(t) - \frac{N}{2}\, p(t)\,dt, \qquad p(0) = \tilde a = \sin\!\left( \sqrt{K/(N-1)}\, a/2 \right), \]
where $\beta(t)$ is a one-dimensional standard Brownian motion. Thus $p(t)$ is the Legendre process, or the diffusion process on $(-1,1)$ generated by $L_N$ given as follows:
\[ L_N := \frac{1}{2}(1 - x^2)\frac{d^2}{dx^2} - \frac{N}{2}\, x \frac{d}{dx}. \]
It is well-known that $\mu_N(dx) = (1 - x^2)^{N/2 - 1}\,dx$ is the symmetrizing measure of $L_N$. Moreover, $L_N$ is essentially selfadjoint on $L^2(\mu_N)$, the spectrum of $L_N$ on $L^2(\mu_N)$ is $\{ -n(n+N-1)/2 \}_{n \in \mathbb{N}}$, all of which are eigenvalues of multiplicity one, and the normalized eigenfunction corresponding to the $n$-th eigenvalue is the $n$-th normalized Gegenbauer polynomial $\tilde P_n$ defined by $\tilde P_n(x) = Z_n^{-1} P_n(x)$ and
\[ Z_n := \left( \int_{-1}^{1} P_n(x)^2\, \mu_N(dx) \right)^{1/2} = \frac{2^{1 - N/2} \sqrt{\pi\, \Gamma(n + N - 1)}}{\sqrt{n!\,(n + (N-1)/2)}\; \Gamma((N-1)/2)}. \]
As a result, the transition density $p_1(t,x,y)$ of $p(t)$ with respect to $\mu_N$ is given by
\[ p_1(t,x,y) = \sum_{n=0}^{\infty} e^{-n(n+N-1)t/2}\, \tilde P_n(x) \tilde P_n(y), \tag{4.6} \]



where the sum converges in $L^2(\mu_N \otimes \mu_N)$. We claim that the infinite sum on the right hand side of (4.6) converges uniformly in $x$ and $y$. Recall that the Gegenbauer polynomial $P_n$ satisfies the following recursion relation:
\[ P_n(x) = \frac{2n + N - 3}{n}\, x P_{n-1}(x) - \frac{n + N - 3}{n}\, P_{n-2}(x), \tag{4.7} \]
\[ P_0(x) = 1, \qquad P_1(x) = (N-1)x. \]
By induction, we can easily show that there exist $c_0 > 0$ and $q > 0$ such that
\[ \sup_{x \in (-1,1)} |P_n(x)| \le c_0 q^n \tag{4.8} \]
(for instance, we can dominate the left hand side by $(N-1)\, 4^n \prod_{k=1}^{n} (1 + |N-3|/k)$). It is not difficult to see that there exists $c_1 > 0$ such that $Z_n^{-1} \le c_1 \sqrt{n}$ for all $n \in \mathbb{N}$ since $N \ge 2$. Then these estimates imply the claim.
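Now the reflection principle yields
\begin{align*}
\varphi_t(a)
&= \mathbb{P}\left[ \inf_{0 \le s \le t} \rho(s) > 0 \right]
= \mathbb{P}\left[ \inf_{0 \le s \le 2Kt/(N-1)} p(s) > 0 \right] \\
&= \int_0^1 \left( p_1\!\left( \frac{2Kt}{N-1}, \tilde a, x \right) - p_1\!\left( \frac{2Kt}{N-1}, -\tilde a, x \right) \right) \mu_N(dx) \\
&= \sum_{n=0}^{\infty} \int_0^1 e^{-n(n+N-1)Kt/(N-1)} \left( \tilde P_n(\tilde a) - \tilde P_n(-\tilde a) \right) \tilde P_n(x)\, \mu_N(dx) \\
&= 2 \sum_{n=0}^{\infty} e^{-(2n+1)(2n+N)Kt/(N-1)}\, \tilde P_{2n+1}(\tilde a) \int_0^1 \tilde P_{2n+1}(x)\, \mu_N(dx).
\end{align*}
The Rodrigues formula for the Gegenbauer polynomial asserts
\[ P_n(x) = \frac{(-2)^n}{n!}\, \frac{\Gamma(n + (N-1)/2)\, \Gamma(n + N - 1)}{\Gamma((N-1)/2)\, \Gamma(2n + N - 1)}\, (1 - x^2)^{-N/2 + 1} \frac{d^n}{dx^n} (1 - x^2)^{n + N/2 - 1}. \]
By using this formula twice, we obtain
\begin{align*}
\int_0^1 P_{2n+1}(x)\, \mu_N(dx)
&= - \frac{(-2)^{2n+1}\, \Gamma(2n + (N+1)/2)\, \Gamma(2n + N)}{(2n+1)!\, \Gamma((N-1)/2)\, \Gamma(4n + N + 1)} \left. \frac{d^{2n}}{dx^{2n}} (1 - x^2)^{2n + N/2} \right|_{x=0} \\
&= \frac{(N-1)\, P^{(N+2)}_{2n}(0)}{(2n+1)(2n+N)},
\end{align*}
where $P^{(N+2)}_{2n}$ is the $(2n)$-th Gegenbauer polynomial of parameter $(N+1)/2$ (associated with $L_{N+2}$). By the recursion formula (4.7), we obtain
\[ P^{(N+2)}_{2n}(0) = (-1)^n \frac{\Gamma(n + (N+1)/2)}{\Gamma((N+1)/2)\, n!}. \]
Thus the duplication formula of the Gamma function yields
\begin{align*}
\varphi_t(a)
&= \sum_{n=0}^{\infty} e^{-(2n+1)(2n+N)Kt/(N-1)}\, \frac{(-1)^n\, 2(N-1)\, \Gamma(n + (N+1)/2)}{Z_{2n+1}^2\, (2n+1)(2n+N)\, \Gamma((N+1)/2)\, n!}\, P_{2n+1}(\tilde a) \\
&= \sum_{n=0}^{\infty} e^{-(2n+1)(2n+N)Kt/(N-1)}\, \frac{(-1)^n (4n+N+1)}{\pi (2n+N)}\, B\!\left( \frac{N-1}{2},\, n + \frac{1}{2} \right) P_{2n+1}(\tilde a).
\end{align*}
This is nothing but the desired identity. □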

Based on the expressions of $\varphi_t(a)$ in Lemma 4.2, Corollary 4.8 and Lemma 4.9, we obtain the asymptotic behavior of $\varphi_t(a)$ as $t \to \infty$ in the following corollary:

Corollary 4.10 The following convergence holds compact uniformly in $a \in [0,R)$:

(i) When $N = \infty$ and $K \ge 0$, or $N < \infty$ and $K = 0$,

lim y/r) K (t)ip t (a) = —=. 
t^oo 4-\/7r 

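In addition, this is an increasing limit.

(ii) When $N = \infty$ and $K < 0$,
\[ \lim_{t \to \infty} \varphi_t(a) = \chi\!\left( \frac{a\sqrt{-K}}{2} \right). \]

(iii) When $N < \infty$ and $K > 0$,
\[ \lim_{t \to \infty} e^{NKt/(N-1)} \varphi_t(a) = \frac{N^2 - 1}{\pi N}\, B\!\left( \frac{N-1}{2}, \frac{1}{2} \right) \sin\!\left( \sqrt{\frac{K}{N-1}}\, \frac{a}{2} \right). \]
In addition, $\sup \{ e^{NKt/(N-1)} \varphi_t(a) \mid t \ge 1,\ a \in [0,R] \} < \infty$.

(iv) When $N < \infty$ and $K < 0$,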



lim cpJa) = 1 f°° x J Ku - sk/^u (-) ) 4 N ~^ 2 e' u du. 

^oo yn ; T N - 1/2 L A \ \ 2 N - 1 K/(N 1} \2J 



(4.9) 



Proof. (i) (ii) The convergence easily follows by elementary calculus. The monotonicity in $t$ in (i) follows from the concavity of $\chi$. In both cases, the Dini theorem ensures the uniformity of the convergence on each compact set.

(iii) The computation of the limit as well as the uniformity on $[0,R]$ and the finiteness of the supremum directly follow from Lemma 4.9 and (4.8).

(iv) By (4.5) and the monotone convergence theorem,
\[ \lim_{t \to \infty} \varphi_t(a) = \mathbb{E}\left[ \chi\!\left( \frac{1}{\sqrt{2}}\, s_{K/(N-1)}\!\left( \frac{a}{2} \right) \left( \int_0^{\infty} \theta'(s)^2\,ds \right)^{-1/2} \right) \right]. \tag{4.10} \]
Then the distribution of $\int_0^{\infty} \theta'(s)^2\,ds$ can be described with the aid of [26, Theorem 6.2] (also see references therein) to obtain (4.9). Since the convergence in (4.10) is monotone, the compact uniformity of the convergence follows from the Dini theorem. □

5 Monotonicity of transportation costs 

Based on Proposition 4.5 and Corollary 4.6, we will show some continuity properties for $\varphi_t(a)$ and $\mathcal{T}_{\varphi_t(d)}$ with respect to $t$ in the following two lemmata.

Lemma 5.1 Let $c_n : M \times M \to [0,\infty)$ be a family of continuous functions converging to $c : M \times M \to [0,\infty)$ pointwise. Let $\mu, \nu \in \mathcal{P}(M)$.

(i) If $\sup_{n,x,y} c_n(x,y) < \infty$ or $c_n$ is nondecreasing in $n$, then
\[ \limsup_{n \to \infty} \mathcal{T}_{c_n}(\mu,\nu) \le \mathcal{T}_c(\mu,\nu). \]

(ii) If the convergence $c_n \to c$ is uniform on each compact set or $c_n$ is nondecreasing in $n$, then
\[ \liminf_{n \to \infty} \mathcal{T}_{c_n}(\mu,\nu) \ge \mathcal{T}_c(\mu,\nu). \]

Proof. (i) Under the assumption on $c_n$, for each $\pi \in \Pi(\mu,\nu)$,
\[ \limsup_{n \to \infty} \mathcal{T}_{c_n}(\mu,\nu) \le \limsup_{n \to \infty} \int_{M \times M} c_n\,d\pi = \int_{M \times M} c\,d\pi. \]
Thus the assertion holds by taking the infimum over $\pi \in \Pi(\mu,\nu)$.

(ii) Take a subsequence $(c_{n_k})_k$ of $(c_n)_n$ so that
\[ \lim_{k \to \infty} \mathcal{T}_{c_{n_k}}(\mu,\nu) = \liminf_{n \to \infty} \mathcal{T}_{c_n}(\mu,\nu). \]
Since $\Pi(\mu,\nu)$ is compact and $c_n$ is continuous and nonnegative, a usual variational argument implies that there is a minimizer of $\mathcal{T}_{c_{n_k}}(\mu,\nu)$, i.e. there exists $\pi_k \in \Pi(\mu,\nu)$ such that $\mathcal{T}_{c_{n_k}}(\mu,\nu) = \int_{M \times M} c_{n_k}\,d\pi_k$. We may assume that $\pi_k$ converges as $k \to \infty$ by taking a subsequence if necessary. We denote the limit by $\pi_\infty$.

First we consider the case that $c_n$ converges to $c$ compact uniformly. Take $\varepsilon > 0$ and choose a compact set $K \subset M \times M$ such that $\pi_k(K^c) < \varepsilon$ for all $k$. Then, for any $R > 0$, the assumption on $c_n$ implies
\begin{align*}
\lim_{k \to \infty} \mathcal{T}_{c_{n_k}}(\mu,\nu)
&\ge \liminf_{k \to \infty} \int_K c_{n_k} \wedge R\,d\pi_k \\
&\ge \liminf_{k \to \infty} \int_K c \wedge R\,d\pi_k - \varepsilon \\
&\ge \liminf_{k \to \infty} \int_{M \times M} c \wedge R\,d\pi_k - (R+1)\varepsilon \\
&= \int_{M \times M} c \wedge R\,d\pi_\infty - (R+1)\varepsilon.
\end{align*}
Thus, by taking $\varepsilon \downarrow 0$ and $R \uparrow \infty$, we obtain
\[ \lim_{k \to \infty} \mathcal{T}_{c_{n_k}}(\mu,\nu) \ge \int_{M \times M} c\,d\pi_\infty \ge \mathcal{T}_c(\mu,\nu) \]
and hence the assertion holds.

Next we consider the case that $c_n$ is nondecreasing in $n$. Then, for $k \in \mathbb{N}$,
\[ \int_{M \times M} c_{n_k}\,d\pi_\infty \le \liminf_{l \to \infty} \int_{M \times M} c_{n_k}\,d\pi_l \le \liminf_{l \to \infty} \int_{M \times M} c_{n_l}\,d\pi_l = \liminf_{n \to \infty} \mathcal{T}_{c_n}(\mu,\nu). \]
By taking $k \to \infty$, the monotone convergence theorem implies that
\[ \mathcal{T}_c(\mu,\nu) \le \limsup_{k \to \infty} \int_{M \times M} c_{n_k}\,d\pi_\infty. \]
Thus, the conclusion follows by combining these two estimates. □

For later use, we will state the following lemma in a slightly more general form than what we will use in the proof of Theorem 2.3.

Lemma 5.2 Let $(\mu_s)_{s \in [0,\infty)}$ and $(\nu_s)_{s \in [0,\infty)}$ be families of probability measures on $M$ which are continuous in $s$ with respect to the topology of weak convergence. For $t > 0$, let $\tilde d : [0,t] \times M \times M \to [0,\infty)$ be a continuous function such that $\tilde d(s,\cdot,\cdot)$ is a distance function on $M$ for each $s \in [0,t]$. Then $s \mapsto \mathcal{T}_{\varphi_{t-s}(\tilde d(s,\cdot,\cdot))}(\mu_s,\nu_s)$ is continuous on $[0,t)$ and lower semi-continuous at $t$.

Proof. For simplicity of notation, we denote $\tilde d(s,x,y)$ by $d_s(x,y)$. Let $s_0 \in [0,t)$ and take a decreasing sequence $(s_n)_{n \in \mathbb{N}}$ with $\lim_{n \to \infty} s_n = s_0$. Let $\varepsilon > 0$. Since $(\mu_{s_n})_{n \in \mathbb{N}}$ and $(\nu_{s_n})_{n \in \mathbb{N}}$ are tight in $\mathcal{P}(M)$, there exists a compact set $K \subset M$ such that $\pi((K \times K)^c) < \varepsilon$ for any $\pi \in \bigcup_{n \in \mathbb{N}} \Pi(\mu_{s_n},\nu_{s_n})$. Since $\varphi_{t-s_n}(a)$ is nonincreasing in $n$, the Dini theorem yields that $\varphi_{t-s_n}(d_{s_0})$ converges to $\varphi_{t-s_0}(d_{s_0})$ as $n \to \infty$ uniformly on $K \times K$. By Corollary 4.6,
\[ |\varphi_{t-s_n}(d_{s_n}(x,y)) - \varphi_{t-s_n}(d_{s_0}(x,y))| \le \varphi_{t-s_n}(|d_{s_n}(x,y) - d_{s_0}(x,y)|). \]
By the assumption on $d_s$, we have
\[ \lim_{n \to \infty} \sup_{x,y \in K} |d_{s_n}(x,y) - d_{s_0}(x,y)| = 0. \]
By combining these estimates, we obtain
\begin{align*}
&\limsup_{n \to \infty} \left| \mathcal{T}_{\varphi_{t-s_n}(d_{s_n})}(\mu_{s_n},\nu_{s_n}) - \mathcal{T}_{\varphi_{t-s_0}(d_{s_0})}(\mu_{s_n},\nu_{s_n}) \right| \\
&\quad \le \limsup_{n \to \infty} \left( \sup_{x,y \in K} |\varphi_{t-s_n}(d_{s_0}(x,y)) - \varphi_{t-s_0}(d_{s_0}(x,y))| + \sup_{x,y \in K} \varphi_{t-s_1}(|d_{s_n}(x,y) - d_{s_0}(x,y)|) \right) + \varepsilon \\
&\quad = \varepsilon. \tag{5.1}
\end{align*}
By virtue of Corollary 4.6 and [34, Theorem 7.12], the weak convergences $\mu_{s_n} \to \mu_{s_0}$ and $\nu_{s_n} \to \nu_{s_0}$ imply that $\mathcal{T}_{\varphi_{t-s_0}(d_{s_0})}(\mu_{s_n},\nu_{s_n})$ converges to $\mathcal{T}_{\varphi_{t-s_0}(d_{s_0})}(\mu_{s_0},\nu_{s_0})$. By combining this fact with (5.1), we obtain
\[ \lim_{n \to \infty} \mathcal{T}_{\varphi_{t-s_n}(d_{s_n})}(\mu_{s_n},\nu_{s_n}) = \mathcal{T}_{\varphi_{t-s_0}(d_{s_0})}(\mu_{s_0},\nu_{s_0}). \]
It proves that $\mathcal{T}_{\varphi_{t-s}(d_s)}(\mu_s,\nu_s)$ is right-continuous at $s_0$. In a similar way, we can show the left-continuity of $\mathcal{T}_{\varphi_{t-s}(d_s)}(\mu_s,\nu_s)$ at $s_0$. Finally we will show the lower semi-continuity at $t$. Since $\varphi_r(a)$ is nonincreasing in $r$, for $t' > t$, we have
\[ \liminf_{s \uparrow t} \mathcal{T}_{\varphi_{t-s}(d_s)}(\mu_s,\nu_s) \ge \lim_{s \uparrow t} \mathcal{T}_{\varphi_{t'-s}(d_s)}(\mu_s,\nu_s) = \mathcal{T}_{\varphi_{t'-t}(d_t)}(\mu_t,\nu_t). \]
Hence the conclusion follows from Lemma 5.1 (ii) by letting $t' \downarrow t$. □

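Proof of Theorem 2.3. Recall that, for $t' \ge 0$, $s' \ge 0$ and $x_1, x_2 \in M$, Theorem 2.2 yields
\[ \mathcal{T}_{\varphi_{t'}(d)}\left( \mathbb{P}_{x_1} \circ X(s')^{-1},\ \mathbb{P}_{x_2} \circ X(s')^{-1} \right) \le \varphi_{t'+s'}(d(x_1,x_2)). \tag{5.2} \]
Let $0 \le s_1 < s_2 < t$. For each $y_1, y_2 \in M$, take $\pi^{y_1,y_2}_{s_2-s_1} \in \Pi(\mathbb{P}_{y_1} \circ X(s_2-s_1)^{-1}, \mathbb{P}_{y_2} \circ X(s_2-s_1)^{-1})$ so that
\[ \mathcal{T}_{\varphi_{t-s_2}(d)}\left( \mathbb{P}_{y_1} \circ X(s_2-s_1)^{-1},\ \mathbb{P}_{y_2} \circ X(s_2-s_1)^{-1} \right) = \int_{M \times M} \varphi_{t-s_2}(d)\,d\pi^{y_1,y_2}_{s_2-s_1}. \]
We can choose $\pi^{y_1,y_2}_{s_2-s_1}$ so that $(y_1,y_2) \mapsto \pi^{y_1,y_2}_{s_2-s_1}$ is measurable (see [35, Corollary 5.22], for instance). Let us take a minimizer $\pi \in \Pi(\mu^{(1)}_{s_1}, \mu^{(2)}_{s_1})$ of $\mathcal{T}_{\varphi_{t-s_1}(d)}(\mu^{(1)}_{s_1}, \mu^{(2)}_{s_1})$ and define $\pi^* \in \Pi(\mu^{(1)}_{s_2}, \mu^{(2)}_{s_2})$ by
\[ \pi^*(A) := \int_{M \times M} \pi^{y_1,y_2}_{s_2-s_1}(A)\, \pi(dy_1 dy_2). \]
Then, by applying (5.2) with $t' = t - s_2$ and $s' = s_2 - s_1$, we obtain
\begin{align*}
\mathcal{T}_{\varphi_{t-s_2}(d)}(\mu^{(1)}_{s_2}, \mu^{(2)}_{s_2})
&\le \int_{M \times M} \varphi_{t-s_2}(d)\,d\pi^* \\
&= \int_{M \times M} \mathcal{T}_{\varphi_{t-s_2}(d)}\left( \mathbb{P}_{y_1} \circ X(s_2-s_1)^{-1},\ \mathbb{P}_{y_2} \circ X(s_2-s_1)^{-1} \right) \pi(dy_1 dy_2) \\
&\le \int_{M \times M} \varphi_{t-s_1}(d(y_1,y_2))\, \pi(dy_1 dy_2) \\
&= \mathcal{T}_{\varphi_{t-s_1}(d)}(\mu^{(1)}_{s_1}, \mu^{(2)}_{s_1}).
\end{align*}
Thus the assertion holds when $t > s_2$. When $t = s_2$, the assertion follows by taking $s_2 \uparrow t$ together with Lemma 5.2 with $\tilde d(t,x,y) := d(x,y)$. □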

In Theorem 2.3, the cost function $\varphi_{t-s}(d)$ depends on the time parameter $t$. Thus it seems natural to consider the limit $t \to \infty$, under a suitable scaling if necessary. We can realize it by combining Corollary 4.10 with Theorem 2.3 with the aid of Lemma 5.1. Then we obtain the following monotonicity of transportation costs:

Corollary 5.3 Let us define $\Theta = \Theta_{K,N} : [0,R) \to [0,\infty)$ and $\kappa = \kappa(K,N) \in \mathbb{R}$ by



©R:,jv(a) := < 



a 
a 

X 

sin 



aV^K 



k(K, N) 



2 
V 



A' V 
NK 



K 



N 



-Ku 



N 



2{N - 1 / 
(JV = oo), 
V0 (iV<oo). 



SK/(N-l) 




{K = 0), 

(N = oo and K > 0), 
(N = oo and K < 0), 

(N < oo and K > 0), 
u (7v-3)/2 e - Mrfu (AT < oo and K < 0), 



For $i = 1,2$ and $\mu_0^{(i)} \in \mathcal{P}(M)$, let $\mu_t^{(i)}$ be the distribution of $X(t)$ with the initial distribution $\mu_0^{(i)}$. Then $e^{\kappa s}\, \mathcal{T}_{\Theta(d)}(\mu_s^{(1)}, \mu_s^{(2)})$ is nonincreasing in $s$.

When $K = 0$, or $N = \infty$ and $K > 0$, what the last corollary states is nothing but the $L^1$-Wasserstein contraction. When $N < \infty$ and $K > 0$, what we obtained is essentially well-known (see [37] for the statement formulated in terms of optimal transport theory). Thus the most interesting assertion is in the case $K < 0$. In the usual $L^p$-Wasserstein contraction in (1.2), the upper bound grows exponentially fast as time increases when $K < 0$. The last corollary says that a nonincreasing property still holds even when $K < 0$ by choosing a cost function appropriately.
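
For a concrete instance of the case $N = \infty$ and $K > 0$, the following sketch may help. It assumes $M = \mathbb{R}$ with the Ornstein-Uhlenbeck generator $\mathcal{L} = d^2/dx^2 - Kx\,d/dx$, which is not taken from the text above but is a standard example satisfying a Bakry-Émery bound with $N = \infty$; $W_1$ denotes the $L^1$-Wasserstein distance, and the starting points $x_1, x_2$ are hypothetical.

```latex
% The diffusion solves dX = -K X dt + \sqrt{2} dB, so the heat distributions
% started from the Dirac masses at x_1 and x_2 are Gaussian:
\[
  \mu_s^{(i)} = \mathcal{N}\!\left( e^{-Ks} x_i,\ \frac{1 - e^{-2Ks}}{K} \right),
  \qquad i = 1, 2 .
\]
% Two Gaussians with equal variance differ by a translation, hence
\[
  W_1\big( \mu_s^{(1)}, \mu_s^{(2)} \big) = e^{-Ks} |x_1 - x_2| ,
  \qquad
  e^{Ks}\, W_1\big( \mu_s^{(1)}, \mu_s^{(2)} \big) = |x_1 - x_2| .
\]
```

In this example the rescaled cost is constant in $s$, so the monotonicity holds with equality; this is consistent with reading $\Theta(a) = a$ and $\kappa = K$ in this regime.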
6 Gradient estimates

For a bounded and measurable function $f : M \to \mathbb{R}$, we define the action of the diffusion semigroup $P_t f$ by $P_t f(x) := \mathbb{E}_x[f(X(t))]$. We denote the dual action of $P_t$ on $\mathcal{P}(M)$ by $P_t^*$. That is,
\[ P_t^* \mu(A) = \int_M \mathbb{P}_x[X(t) \in A]\, \mu(dx). \]
Since $\varphi_t(d(x,y))$ is a distance function by Corollary 4.6, the Kantorovich-Rubinstein duality easily implies the following (cf. [21,30]):

Theorem 6.1 Given $t, s \ge 0$, the following are equivalent:

(i) For $\mu_1, \mu_2 \in \mathcal{P}(M)$,
\[ \mathcal{T}_{\varphi_t(d)}(P_s^* \mu_1, P_s^* \mu_2) \le \mathcal{T}_{\varphi_{t+s}(d)}(\mu_1, \mu_2). \]

(ii) For any $\varphi_t^{K,N}(d)$-Lipschitz function $f$ on $M$,
\[ \sup_{x \neq y} \frac{|P_s f(x) - P_s f(y)|}{\varphi_{t+s}(d(x,y))} \le \sup_{x \neq y} \frac{|f(x) - f(y)|}{\varphi_t(d(x,y))}. \]
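
The way the duality enters can be sketched as follows. This is only an outline under stated assumptions: $\rho_t := \varphi_t(d)$ is a shorthand introduced for this sketch, condition (i) is read as $\mathcal{T}_{\rho_t}(P_s^*\mu_1, P_s^*\mu_2) \le \mathcal{T}_{\rho_{t+s}}(\mu_1,\mu_2)$, and condition (ii) is read as saying that $P_s$ maps $\rho_t$-Lipschitz functions to $\rho_{t+s}$-Lipschitz functions without increasing the Lipschitz constant.

```latex
% Kantorovich-Rubinstein duality for the bounded distance \rho_t:
\[
  \mathcal{T}_{\rho_t}(\mu, \nu)
  = \sup\Big\{ \int_M f\,d\mu - \int_M f\,d\nu \;:\;
      |f(x) - f(y)| \le \rho_t(x,y) \ \text{for all } x, y \in M \Big\}.
\]
% (ii) => (i): if f is 1-Lipschitz for \rho_t, then P_s f is 1-Lipschitz
% for \rho_{t+s}, so
\[
  \int_M f\,d(P_s^*\mu_1) - \int_M f\,d(P_s^*\mu_2)
  = \int_M P_s f\,d\mu_1 - \int_M P_s f\,d\mu_2
  \le \mathcal{T}_{\rho_{t+s}}(\mu_1, \mu_2).
\]
% (i) => (ii): take Dirac masses \mu_1 = \delta_x and \mu_2 = \delta_y, so that
\[
  |P_s f(x) - P_s f(y)|
  \le \mathcal{T}_{\rho_t}(P_s^*\delta_x, P_s^*\delta_y)
  \le \rho_{t+s}(x,y)
\]
% for every f that is 1-Lipschitz for \rho_t.
```

Taking the supremum over $f$ in the second display gives (i), and the third display gives (ii) after normalizing the Lipschitz constant of $f$.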

The condition (i) in the last theorem is a consequence of Theorem 2.3. Note that, in the condition (ii), those suprema may be attained at $(x,y) \in M \times M$ with $d(x,y) > 0$ since $\varphi_t(d)$ is not a geodesic distance. As an easy consequence of Theorem 6.1, we obtain the following gradient estimate.

Corollary 6.2 Under Assumption 1, we have
\[ \| \nabla P_t f \|_\infty \le \varphi_t'(0)\, \mathrm{osc}(f) \]
for any bounded measurable function $f$ on $M$.

Recall that an expression of $\varphi_t'(0)$ is given in Proposition 4.5. Note that a gradient estimate like the one in Corollary 6.2 also follows from the reverse Poincaré inequality (see [4,22] for instance; also see [2]). When $K \ge 0$, Corollary 6.2 and Proposition 4.5 (v) easily imply the Liouville property, that is, there are no nonconstant bounded $\mathcal{L}$-harmonic functions, by taking $f$ as a bounded harmonic function (so that $P_t f = f$) and $t \to \infty$.

Proof. Theorem 2.3 tells us that the condition (i) holds with $t = 0$ under Assumption 1. Then the definition of $\varphi_0$ yields
\[ \sup_{x \neq y} \frac{|f(x) - f(y)|}{\varphi_0^{K,N}(d(x,y))} = \mathrm{osc}(f). \]
Recall that the differentiability of $\varphi_s^{K,N}$ at $0$ is given in Proposition 4.5 (v). Thus Theorem 6.1 implies that
\[ \frac{1}{(\varphi_s^{K,N})'(0)} \| \nabla P_s f \|_\infty \le \sup_{x \neq y} \frac{|P_s f(x) - P_s f(y)|}{\varphi_s^{K,N}(d(x,y))} \le \mathrm{osc}(f) \]
and hence the conclusion holds. □
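
The estimate of Corollary 6.2 can be tested numerically in the flat one-dimensional case. The sketch below is an illustration under explicit assumptions, not part of the original argument: it takes $M = \mathbb{R}$, the generator $d^2/dx^2$ (so $K = 0$, $N = \infty$ and the heat kernel is Gaussian with variance $2t$), and $\varphi_t(a) = \chi(a/(2\sqrt{2t}))$ with the assumed normalization $\chi(r) = \mathbb{P}[|G| \le r]$. All function names are hypothetical.

```python
import math


def heat_sign(t: float, x: float) -> float:
    """P_t f(x) for f = sign on the real line, generator d^2/dx^2."""
    return math.erf(x / (2.0 * math.sqrt(t)))


def heat_indicator(t: float, x: float) -> float:
    """P_t f(x) for f = indicator of [0, 1], generator d^2/dx^2."""
    s = 2.0 * math.sqrt(t)
    return 0.5 * (math.erf(x / s) - math.erf((x - 1.0) / s))


def phi_flat(t: float, a: float) -> float:
    """phi_t(a) for K = 0: chi(a / (2 sqrt(2t))) = erf(a / (4 sqrt(t)))."""
    return math.erf(a / (4.0 * math.sqrt(t)))


def phi_slope_at_zero(t: float, h: float = 1e-6) -> float:
    """One-sided difference quotient for phi_t'(0), using phi_t(0) = 0."""
    return phi_flat(t, h) / h


def central_diff(g, x: float, h: float = 1e-5) -> float:
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


def max_gradient(g, lo: float = -3.0, hi: float = 4.0, n: int = 700) -> float:
    """Largest absolute slope of g on a uniform grid of [lo, hi]."""
    step = (hi - lo) / n
    return max(abs(central_diff(g, lo + i * step)) for i in range(n + 1))


if __name__ == "__main__":
    t = 1.0
    # f = sign has oscillation 2; f = indicator of [0, 1] has oscillation 1.
    print(max_gradient(lambda x: heat_sign(t, x)), 2.0 * phi_slope_at_zero(t))
    print(max_gradient(lambda x: heat_indicator(t, x)), phi_slope_at_zero(t))
```

Under these assumptions the sign function attains the bound at $x = 0$, while the indicator function stays strictly below it.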
7 Stability under the Gromov-Hausdorff convergence

In this section we consider a sequence of Riemannian manifolds $(M_n, g_n)$ ($n \in \mathbb{N}$). For technical reasons, we restrict ourselves to the case that each $M_n$ is compact. Let $f_n$ be a positive $C^1$-function on $M_n$ and $Z_n := \nabla f_n$ for $n \in \mathbb{N}$. Suppose that, given $N < \infty$ and $K$, $(M_n, g_n)$ and $Z_n$ satisfy Assumption 1 for all $n \in \mathbb{N}$, where the parameters $N$ and $K$ are independent of $n$. Let $\mathrm{vol}_n$ be the Riemannian volume measure on $(M_n, g_n)$ and set $\nu_n = e^{f_n} \mathrm{vol}_n$. Under Assumption 1, the metric measure space $(M_n, d_n, \nu_n)$ satisfies the curvature-dimension condition $CD(K,\infty)$ (see [25,31,32]). Thus the gradient flow of the relative entropy functional $\mathrm{Ent}_{\nu_n}$ on the $L^2$-Wasserstein space over $(M_n, d_n, \nu_n)$ is identified with the gradient flow of the Dirichlet energy functional on $L^2(M_n, \nu_n)$ (see [1,8,11]).
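Definition 7.1 (i) Let $(M_1, d_1)$ and $(M_2, d_2)$ be metric spaces. For $\varepsilon > 0$, we call $f : M_1 \to M_2$ an $\varepsilon$-isometry if the following hold:
\[ \sup_{x,y \in M_1} |d_1(x,y) - d_2(f(x), f(y))| \le \varepsilon, \qquad \sup_{y \in M_2} d_2(y, f(M_1)) \le \varepsilon. \]

(ii) Let $((M_n, d_n, \nu_n))_{n \in \mathbb{N}}$ and $(M, d, \nu)$ be metric measure spaces. We say $(M_n, d_n, \nu_n)$ converges to $(M, d, \nu)$ as $n \to \infty$ in the measured Gromov-Hausdorff sense if there exist $\varepsilon_n > 0$ ($n \in \mathbb{N}$) with $\lim_{n \to \infty} \varepsilon_n = 0$ and $\varepsilon_n$-isometries $f_n : M_n \to M$ so that $(f_n)_\# \nu_n$ converges to $\nu$ in the vague topology.

In the sequel, we assume that $(M_n, d_n, \nu_n)$ converges to a compact metric measure space $(M, d, \nu)$ in the measured Gromov-Hausdorff sense via $\varepsilon_n$-isometries $f_n : M_n \to M$. Note that, in this framework, the convergence with respect to the measured Gromov-Hausdorff distance is equivalent to the convergence with respect to the distance $\mathbb{D}$ introduced in [31] (see [31, Subsection 3.4]). Under the assumption, $(M, d, \nu)$ satisfies $CD(K,\infty)$ again (see [2,25,31]). Thus, for $\mu_0 \in \mathcal{P}(M)$ with $\mathrm{Ent}_\nu(\mu_0) < \infty$, there exists a unique gradient curve $\mu_t$ of $\mathrm{Ent}_\nu$ on $\mathcal{P}(M)$ starting from $\mu_0$ (see [1,2,10]).

The following theorem asserts that these gradient curves enjoy the same monotonicity as shown in Theorem 2.3:

Theorem 7.2 For $i = 1,2$, let $\mu_0^{(i)} \in \mathcal{P}(M)$ with $\mathrm{Ent}_\nu(\mu_0^{(i)}) < \infty$ and $\mu_t^{(i)}$ a gradient curve of $\mathrm{Ent}_\nu$ with initial distribution $\mu_0^{(i)}$. Then, for any $t \in [0,\infty)$, $\mathcal{T}_{\varphi_{t-s}(d)}(\mu_s^{(1)}, \mu_s^{(2)})$ is a nonincreasing function of $s \in [0,t]$.

Proof. By virtue of Lemma 5.2, it suffices to show the assertion for $s \in (0,t)$. For $i = 1,2$, there exists $\mu_0^{(i,n)} \in \mathcal{P}(M_n)$ for $n \in \mathbb{N}$ such that $\mathrm{Ent}_{\nu_n}(\mu_0^{(i,n)}) < \infty$ and that $(f_n)_\# \mu_0^{(i,n)}$ converges to $\mu_0^{(i)}$, by following an argument in the proof of [25, Theorem 4.15]. Let $\mu_t^{(i,n)}$ be the gradient curve of $\mathrm{Ent}_{\nu_n}$ on $\mathcal{P}(M_n)$ with the initial distribution $\mu_0^{(i,n)}$. Then, by virtue of [10, Theorem 21], $(f_n)_\# \mu_t^{(i,n)}$ converges to $\mu_t^{(i)}$ for $i = 1,2$ and $t \ge 0$. We claim that for each $s \in (0,t)$,
\[ \lim_{n \to \infty} \left( \mathcal{T}_{\varphi_{t-s}(d_n)}(\mu_s^{(1,n)}, \mu_s^{(2,n)}) - \mathcal{T}_{\varphi_{t-s}(d)}((f_n)_\# \mu_s^{(1,n)}, (f_n)_\# \mu_s^{(2,n)}) \right) = 0. \tag{7.1} \]
Take $\pi^{(n)} \in \Pi(\mu_s^{(1,n)}, \mu_s^{(2,n)})$ and set $\tilde\pi^{(n)} := (f_n \times f_n)_\# \pi^{(n)}$. Then we can easily see that $\tilde\pi^{(n)} \in \Pi((f_n)_\# \mu_s^{(1,n)}, (f_n)_\# \mu_s^{(2,n)})$ holds. Since $f_n$ is an $\varepsilon_n$-isometry and $\varphi_{t-s}(\cdot)$ is nondecreasing,
\begin{align*}
\mathcal{T}_{\varphi_{t-s}(d)}((f_n)_\# \mu_s^{(1,n)}, (f_n)_\# \mu_s^{(2,n)})
&\le \int_{M \times M} \varphi_{t-s}(d(x,y))\, \tilde\pi^{(n)}(dx\,dy) \\
&= \int_{M_n \times M_n} \varphi_{t-s}(d(f_n(x), f_n(y)))\, \pi^{(n)}(dx\,dy) \\
&\le \int_{M_n \times M_n} \varphi_{t-s}(d_n(x,y) + \varepsilon_n)\, \pi^{(n)}(dx\,dy). \tag{7.2}
\end{align*}
Since our choice of $\pi^{(n)} \in \Pi(\mu_s^{(1,n)}, \mu_s^{(2,n)})$ can be arbitrary, (7.2) and Corollary 4.6 yield
\[ \mathcal{T}_{\varphi_{t-s}(d)}((f_n)_\# \mu_s^{(1,n)}, (f_n)_\# \mu_s^{(2,n)}) \le \mathcal{T}_{\varphi_{t-s}(d_n)}(\mu_s^{(1,n)}, \mu_s^{(2,n)}) + \varphi_{t-s}(\varepsilon_n). \tag{7.3} \]
To complete the proof of the claim, let us take an approximate inverse $g_n$ for each $n \in \mathbb{N}$, that is, $g_n : M \to M_n$ satisfies
\[ \lim_{n \to \infty} \sup_{x \in M} d(x, f_n(g_n(x))) = 0, \qquad \lim_{n \to \infty} \sup_{x \in M_n} d_n(x, g_n(f_n(x))) = 0. \]
We may assume that $g_n$ is an $\varepsilon_n'$-isometry for some $\varepsilon_n'$ with $\lim_{n \to \infty} \varepsilon_n' = 0$ without loss of generality. By a similar argument as what we used to obtain (7.3),
\[ \mathcal{T}_{\varphi_{t-s}(d_n)}((g_n \circ f_n)_\# \mu_s^{(1,n)}, (g_n \circ f_n)_\# \mu_s^{(2,n)}) \le \mathcal{T}_{\varphi_{t-s}(d)}((f_n)_\# \mu_s^{(1,n)}, (f_n)_\# \mu_s^{(2,n)}) + \varphi_{t-s}(\varepsilon_n'). \tag{7.4} \]
Since $(\mathrm{id} \times (g_n \circ f_n))_\# \mu_s^{(i,n)} \in \Pi(\mu_s^{(i,n)}, (g_n \circ f_n)_\# \mu_s^{(i,n)})$,
\[ \mathcal{T}_{\varphi_{t-s}(d_n)}(\mu_s^{(i,n)}, (g_n \circ f_n)_\# \mu_s^{(i,n)}) \le \int_{M_n} \varphi_{t-s}(d_n(x, g_n(f_n(x))))\, \mu_s^{(i,n)}(dx) \le \varphi_{t-s}\!\left( \sup_{x \in M_n} d_n(x, g_n(f_n(x))) \right) \]
for $i = 1,2$. By combining this estimate with (7.4) and the triangle inequality for $\mathcal{T}_{\varphi_{t-s}(d_n)}$, we obtain
\[ \mathcal{T}_{\varphi_{t-s}(d_n)}(\mu_s^{(1,n)}, \mu_s^{(2,n)}) \le \mathcal{T}_{\varphi_{t-s}(d)}((f_n)_\# \mu_s^{(1,n)}, (f_n)_\# \mu_s^{(2,n)}) + 2 \varphi_{t-s}\!\left( \sup_{x \in M_n} d_n(x, g_n(f_n(x))) \right) + \varphi_{t-s}(\varepsilon_n'). \tag{7.5} \]
Hence (7.3) and (7.5) imply the claim since $\varphi_{t-s}(\cdot)$ is continuous.

By Corollary 4.6 and [34, Theorem 7.12], $\mathcal{T}_{\varphi_{t-s}(d)}((f_n)_\# \mu_s^{(i,n)}, \mu_s^{(i)})$ converges to $0$ as $n \to \infty$ for $i = 1,2$. Hence (7.1) yields $\lim_{n \to \infty} \mathcal{T}_{\varphi_{t-s}(d_n)}(\mu_s^{(1,n)}, \mu_s^{(2,n)}) = \mathcal{T}_{\varphi_{t-s}(d)}(\mu_s^{(1)}, \mu_s^{(2)})$. Since $\mathcal{T}_{\varphi_{t-s}(d_n)}(\mu_s^{(1,n)}, \mu_s^{(2,n)})$ is nonincreasing in $s$ by Theorem 2.3, the conclusion holds. □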
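
The two conditions of an $\varepsilon$-isometry in Definition 7.1 (i) (bounded distortion and an $\varepsilon$-dense image) can be evaluated directly on finite metric spaces. The following is a minimal sketch for illustration; the function name `eps_isometry_defect` and the grid example are hypothetical and not part of the original text.

```python
from itertools import product


def eps_isometry_defect(points1, d1, points2, d2, f):
    """Smallest eps for which f is an eps-isometry between finite metric spaces.

    points1, points2: lists of points; d1, d2: distance functions on them;
    f: a map from points1 into points2.
    """
    # Distortion: sup over x, y in M_1 of |d_1(x, y) - d_2(f(x), f(y))|.
    distortion = max(
        abs(d1(x, y) - d2(f(x), f(y))) for x, y in product(points1, repeat=2)
    )
    # Density: sup over y in M_2 of the distance from y to the image f(M_1).
    image = [f(x) for x in points1]
    density = max(min(d2(y, z) for z in image) for y in points2)
    return max(distortion, density)


if __name__ == "__main__":
    # A coarse grid in [0, 1] included into a finer grid.
    coarse = [i / 4 for i in range(5)]
    fine = [i / 8 for i in range(9)]
    dist = lambda x, y: abs(x - y)
    print(eps_isometry_defect(coarse, dist, fine, dist, lambda x: x))
```

In the grid example the inclusion has zero distortion, so the defect comes only from the density condition: every point of the fine grid is within half a coarse mesh width of the image.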
8 Time-dependent metrics

Let $(g(t))_{t \in [T_1,T_2]}$ be a family of smooth complete Riemannian metrics on $M$ depending smoothly on $t$. Let $Z(t)$ be a time-dependent vector field on $M$ depending continuously on $t$ and consider the time-inhomogeneous diffusion process $((X(t))_{t \in [T_1,T_2]}, (\mathbb{P}_x)_{x \in M})$ generated by $\mathcal{L}_t := \Delta_{g(t)} + Z(t)$. The following assumption corresponds to Assumption 1 with $N = \infty$:

Assumption 2 Given $K \in \mathbb{R}$, the following holds for each $t$:
\[ (\nabla Z(t))^\flat + \frac{1}{2} \partial_t g(t) \le \mathrm{Ric}_{g(t)} - K g(t). \]

An important example of the time-dependent metrics $g(t)$ satisfying Assumption 2 is the backward Ricci flow, that is,
\[ \frac{1}{2} \partial_t g(t) = \mathrm{Ric}_{g(t)}. \]

Under Assumption 2, the coupling by reflection of $X(t)$ is already studied in [17] via the approximation by geodesic random walks (the notation in [17] looks slightly different since we considered the diffusion process generated by $\Delta_{g(t)}/2 + Z(t)$ there). By modifying arguments in previous sections, we can obtain the results corresponding to Theorem 2.2, Theorem 2.3, Corollary 2.4 and Corollary 5.3 with $N = \infty$ by replacing $d$ which measures the distribution at $s$ with $d_{g(T_1+s)}$. For example, the conclusion of the statement corresponding to Theorem 2.3 is as follows: Let $\mu_s^{(i)}$ be the distribution of $X(t)$ at $t = s + T_1$ with initial distribution $\mu_0^{(i)}$ for $i = 1,2$. Then, for $t \ge s_2 > s_1 \ge 0$,
\[ \mathcal{T}_{\varphi_{t-s_2}(d_{g(T_1+s_2)})}(\mu_{s_2}^{(1)}, \mu_{s_2}^{(2)}) \le \mathcal{T}_{\varphi_{t-s_1}(d_{g(T_1+s_1)})}(\mu_{s_1}^{(1)}, \mu_{s_1}^{(2)}). \tag{8.1} \]

For reader's convenience, let us explain briefly why the time derivative with respect 
to the metric appears in Assumption 2. When we follow the argument in the time- 
independent metric case in Proposition 3.1, we consider d g u^)(X- a (t'^)) instead of r a (n) = 
d(X. a . Then, in the Taylor expansion in the proof of Lemma 3.2, there appears the 
time derivative of $d_{g(t)}$ as an additional term. It can be described in terms of the time derivative of $g(t)$. Then our condition in Assumption 2 will be used to incorporate this additional term into the lower bound of the Bakry-Émery Ricci tensor. For more details, see [17].

Note that, by [17, Lemma 2.5], $\tilde d(t,x,y) := d_{g(t+T_1)}(x,y)$ satisfies the assumption of Lemma 5.2. This fact will be used to complete the proof of (8.1) when $t = s_2$.

Acknowledgment. The first named author is grateful for the support by the Grant-in-Aid
for Young Scientists (B) 22740083.